Smaragdov, M. G; Kudinov, A. A
(BioMed Central, 2020)
Abstract
Background
The advent of SNP array technology has made genome-wide analysis of genetic differences between populations and breeds possible at a previously unattainable level. Wright’s fixation index (Fst) and principal component analysis (PCA) are widely used methods in animal genetics studies. In this paper we compared the power of these methods, examined how they complement each other, and assessed which of them is more powerful.
Results
A comparative analysis of the power of principal component analysis (PCA) and Fst was carried out to reveal genetic differences between herds of Holsteinized cows. In total, 803 BovineSNP50 genotypes of cows from 13 herds were used in the current study. The Fst values obtained ranged from 0.002 to 0.012 (mean 0.0049), while for rare SNPs with MAF 0.0001–0.005 they were even smaller, ranging from 0.001 to 0.01 (mean 0.0027). Genetic relatedness of the cows in the herds was the cause of such small Fst values. The contribution of rare alleles with MAF 0.0001–0.01 to the Fst values was much smaller than that of common alleles, and this effect depends on linkage disequilibrium (LD). Despite a substantial change in the MAF spectrum and in the number of SNPs, LD-based pruning had only a small effect on the Fst data. PCA confirmed the mutual admixture and the small genetic differences between herds. Moreover, visualizing the results of a single eigenvector cannot be used to significantly differentiate herds; only summed eigenvectors should be used to realize the full power of PCA for detecting small genetic differences between herds. Finally, we present evidence that the significance of Fst data far exceeds that of PCA data when these methods are used to reveal genetic differences between herds.
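To make the Fst quantity above concrete, the following is a minimal sketch of Wright's classical definition, Fst = (H_T - H_S) / H_T, computed per SNP from 0/1/2 genotype counts. It is an illustration only and not necessarily the estimator used in the study. The function name `wright_fst`, the input layout and the toy data are assumptions, herds are weighted by their sample size, and missing genotypes are not handled.

```python
def wright_fst(genotypes, herd_labels):
    """Per-SNP Wright's Fst = (H_T - H_S) / H_T (illustrative sketch).

    genotypes:   list of rows, one per animal; each row holds 0/1/2
                 counts of the reference allele for every SNP.
    herd_labels: herd identifier for each animal, in row order.
    Returns one value per SNP, or None where the SNP is monomorphic
    across all animals (H_T = 0, so Fst is undefined).
    """
    herds = sorted(set(herd_labels))
    n_total = len(genotypes)
    n_snps = len(genotypes[0])
    fst = []
    for j in range(n_snps):
        h_s = 0.0    # size-weighted mean within-herd expected heterozygosity
        p_bar = 0.0  # size-weighted mean allele frequency over all herds
        for herd in herds:
            counts = [row[j] for row, lab in zip(genotypes, herd_labels)
                      if lab == herd]
            weight = len(counts) / n_total
            p = sum(counts) / (2 * len(counts))  # allele frequency in herd
            h_s += weight * 2 * p * (1 - p)
            p_bar += weight * p
        h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
        fst.append((h_t - h_s) / h_t if h_t > 0 else None)
    return fst


# Toy data (hypothetical): 4 animals, 2 herds, 3 SNPs.
toy_genotypes = [
    [2, 1, 0],
    [2, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
toy_herds = ["A", "A", "B", "B"]
toy_fst = wright_fst(toy_genotypes, toy_herds)
```

In the toy data the first SNP is fixed for different alleles in the two herds, the second has the same frequency in both, and the third is monomorphic, so the three entries cover complete differentiation, no differentiation and the undefined case. Averaging such per-SNP values within a MAF bin would be one way to compare rare and common SNPs.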
Conclusions
LD-based pruning had a small effect on the findings of the Fst and PCA analyses. Therefore, LD-based pruning is not effective for weakly structured populations. In addition, our results show that the significance of genetic differences between herds obtained by Fst analysis exceeds that obtained by PCA. To differentiate herds or weakly structured populations, we recommend using the Fst approach first and only then PCA.