Predicting productive, health, and reproductive traits in Mexican Holstein cattle using single nucleotide polymorphisms, haplotypes, and runs of homozygosity
José G. Cortes-Hernández, Guillermo Martinez-Boggio, Francisco Peñagaricano, Hugo H. Montaldo, Felipe J. Ruiz-López, Adriana García-Ruiz

TL;DR
This study explores how genomic data can improve predictions of productivity, health, and reproduction in Mexican Holstein cattle.
Contribution
The study compares the effectiveness of SNPs, haplotypes, and ROH in predicting traits and finds that ROH provides higher genetic variance estimates.
Findings
Using ROH increases the estimation of genetic variance for health and reproductive traits.
SNPs provided the highest predictive correlation for traits like milk yield and somatic cell score.
Combining SNPs, HAP, and ROH did not improve predictive performance over using SNPs alone.
Abstract
Summary: We investigated the use of different sources of genomic information, such as single nucleotide polymorphisms (SNPs), haplotypes (HAP), and runs of homozygosity (ROH), to predict productive, health, and reproductive traits in Mexican Holstein cattle. The use of models including genomic information from DNA markers increased the predictive performance and yielded higher estimates of genetic variance compared with models that included only fixed effects of complex traits in Mexican Holstein cattle. Summary: We investigated the use of different sources of genomic information, such as single nucleotide polymorphisms (SNPs), haplotypes (HAP), and runs of homozygosity (ROH), to predict productive, health, and reproductive traits in Mexican Holstein cattle. The use of models including genomic information from DNA markers increased the predictive performance and yielded higher…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenetic and phenotypic traits in livestock · Genetic Mapping and Diversity in Plants and Animals · Reproductive Physiology in Livestock
High-throughput technologies have revolutionized dairy cattle genetics research, enabling the generation of multiple sources of genomic information, including SNPs, HAP, and ROH, as well as multi-omics data, such as genomics, transcriptomics, metabolomics, and proteomics. Their combination with phenotypic data such as production, health, and reproductive traits allows us to improve dairy cattle performance and informing breeding decisions (Zhu et al., 2022).
Health and reproductive traits are complex and often challenging to measure consistently on farms. Nonetheless, they are critical to the profitability of dairy farms. For example, SCS is a key indicator of udder health and milk quality, whereas days open is a measure of the interval between calving and subsequent conception, which directly affects reproductive efficiency, voluntary culling rates, and replacement costs (Temesgen et al., 2022; Ashja et al., 2024).
The use of multiple sources of genomic information is expected to increase the prediction accuracy of complex traits. These sources may capture various signals affecting phenotypes and thus could be combined to improve prediction performance. Previous research has shown the use of multiple sources of information, including SNP markers, HAP, and others omics data, increases the prediction of complex traits, such as fertility, milk composition, and feed efficiency (Baba et al., 2021; Martinez Boggio et al., 2024).
The use of an additive genetic relationship matrix based on SNPs is commonly used in genetic studies. Therefore, the integration of SNP, HAP, and ROH, individually (Ferdosi et al., 2016; Karimi et al., 2018) or combined, may represent an alternative strategy for enhancing phenotypic prediction, due to each genetic structure providing complementary information. For instance, HAP could identify genomic regions associated with specific functional traits and reveal genomic regions favored by natural or artificial selection that harbor genes relevant to the breed (Chen et al., 2018) and the use a genomic relationship matrix based on HAP, which could potentially capture interactions or information about genetic determinants with small effects that are not captured by individual SNPs (Teissier et al., 2020). The ROH are continuous segments of homozygous genotypes; these segments arise because the HAP were transmitted by a common ancestor sometimes influenced by genetic selection (Zhao et al., 2021) and can be used to build genomic matrices based on its presence or lack of presence. Likewise, ROH can be used to quantify the level of inbreeding and to infer aspects of a population's demographic and selection history, including past bottlenecks, inbreeding events, and selection pressures, based on their length and frequency across the genome (Zayas et al., 2024). Moreover, the amount of ROH segments increases the probability that deleterious recessive alleles become homozygous, potentially leading to reduced fertility, lower vigor, and greater disease susceptibility in cattle (Bhati et al., 2020). The ROH have been associated with productive, health, and reproductive traits in dairy cattle (Cortes-Hernández et al., 2022; Pacheco et al., 2023). Other studies in beef cattle have identified ROH islands on chromosomes 1, 3, 8, 11, 13, 14, and 16 where SNPs have been reported to be associated with growth, reproductive, milk production, residual feed intake, and meat quality traits (Zayas et al., 2024). Despite the use of ROH in association with economic traits, to the best of our knowledge, few studies have used them to perform phenotypic prediction in dairy cattle (Luan et al., 2014).
The objective of this study was to assess the phenotypic prediction of milk yield, SCS, and days open, by integrating SNPs, HAP, and ROH in Mexican Holstein cows. We hypothesized that the use of multiple sources of genomic information could enhance phenotypic prediction, especially when HAP and ROH capture variation that SNP data do not explain. We also evaluated the proportion of genetic variance explained by each of the sources of genomic information considered separately or in combination.
Data consisted of phenotypes and genotypes from 5,746 first-lactation Mexican Holstein cows. Cows were born between 2006 and 2016 from 271 herds. We evaluated 3 different phenotypic traits: milk yield as 305-ME (mature equivalent), SCS expressed as log_2_ (SCC/1,000,000) + 3, and days open defined as the number of days between calving and conception. Quality control consisted of including cows between 16 to 60 mo of age at first calving, and range of ±3 SD was established for milk yield, and a range of 30 to 400 d for days open. Only animals with all genomic and phenotype information were included. All cows had imputed genotype with 116,204 SNPs coming from different density commercial arrays (Illumina, Neogen, and Axiom): BovineLD v2.0 9K (4.15%), GGP Bovine LD v3.0 26K (5.91%), GGP Super LD v4.0 26K (1.38%), BovineSNP v3 50K (9.16%), GGP LD 77K (15.42%), GGP Bovine 100K (0.03%), GGP HD 150K (59.77%), Genome-Wide BOS 1 Bovine Array 640K (2.47%), and GGP HD 777K (1.71%), which were imputed using Findhap V2 (VanRaden et al., 2011; García-Ruiz et al., 2015).
From those 116,204 SNPs, the HAP were constructed following 3 steps: (1) we recoded SNPs as A/B (0 = BB; 1 = AB; 2 = AA; and 3, 4, and 5 = 00); (2) we defined HAP with at least 2 SNP per block and a value of linkage disequilibrium (LD; r^2^) ≥ 0.80 (Mucha et al., 2019); finally (3) each HAP was recoded as pseudoSNP 0/1/2 corresponding to the absence of paternal and maternal alleles, the presence of 1 copy or 2 copies, respectively. The included HAP had a minimum frequency of 1% in the population (Teissier et al., 2020). For detection, inference, and recodification of HAP as pseudoSNP we used PLINK 1.07 software (Purcell et al., 2007; Calderón-Chagoya et al., 2023). Although this version of PLINK may have limitations in HAP inference, we compared the haplotypes found with PLINK 1.9 (Chang et al., 2015) and obtained the same results. Another option is to use Beagle v5.1 software package for haplotype inference (Browning et al., 2018; Bian et al., 2021). After quality control to HAP and SNPs (call rate <0.95, minor allele frequency <0.05, and Hardy–Weinberg equilibrium test <0.15), we retained 88,911 SNPs and 35,552 HAP. Furthermore, we detected ROH using PLINK 1.9 software (Chang et al., 2015) with the following parameters: –bfile–cow–geno 0.1–mind 0.1–homozyg group–homozyg-gap 500–homozyg-window-het 1–homozyg-window-missing 1–homozyg-window-snp 50–not-chr X (Meyermans et al., 2020; Pacheco et al., 2023). We found a total of 111,084 ROH, of which 17.5% were present in more than one animal; the number of ROH per animal was 32.74 ± 6.94, with an average length and covered genome of 7.38 ± 6.85 Mb and 241.80 ± 75.13 Mb, respectively. Previous details of ROH in the Holstein population of Mexico were published by Cortes-Hernández et al. (2022). After ROH detection, each ROH was recorded as 0 or 1 according to the absence or presence in each animal, leaving a total of 106,082 ROH for milk yield, 102,427 for SCS, and 89,419 for days open, allowing at least one ROH per animal; the difference in the number of ROH is due to the different number of records by trait.
For prediction analysis, only animals with all genomic and phenotype information were included, records of 5,746 cows with milk yield, 5,508 of SCS, and 4,707 records of days open, and we computed 3 genomic relationship matrices like kernels including SNPs, HAP, and ROH data for each trait. The prediction models were defined as follows:
where y is the analyzed trait (MY, SCS, and days open), HYS is the vector of herd-year-season (including 113 levels for milk yield, 105 for SCS, 95 for days open, with 50.85 ± 31.35 animals on average, and a minimum of 20 animals by level), AGE is the age vector in months (included also as a quadratic effect), u is a vector of genetic effects, and e is a vector of residuals. Matrices K1, K2, and K3 are linear or Gaussian kernels for the genomic effect. We computed 3 kernels, 2 linear kernels with SNPs and HAP, and 1 Gaussian kernel with ROH. The linear kernel with SNPs was computed as K_SNP_ = ZZ^T^/k, where Z is a matrix of centered and standardized SNP genotypes, k represents the number of SNP, and T is the transpose matrix. The linear kernel with HAP was computed as K_HAP_ = ZZ^T^/k, where Z is a matrix of centered and standardized HAP and k represents the number of HAP. The Gaussian kernel for ROH was obtained using the average squared-Euclidean distance between ROH as follows:
where a_i_ and a_j_ are elements of 2 animals to be compared, k refers to a particular ROH, and h is the bandwidth parameter (set as 1.5) chosen over a grid of values to maximize the prediction accuracy; dividing by the mean of the squared distances ensures that the overall scale of the distances is adjusted (De los Campos et al., 2010).
Prediction models were implemented within a Bayesian framework using Markov Chain Monte Carlo methods; each model was fitted with 100,000 iterations, 30,000 burn-in and thin 5. We retained 14,000 samples for inference. All analyses were performed using the Bayesian Generalized Linear Regression package using software R 4.4.2 (BGLR, Pérez and de los Campos, 2014). To assess how well the ridge regression models could predict unseen phenotypes, a 10-fold cross-validation was conducted. Records were randomly divided into 10 nonoverlapping subsets of roughly equal size (574 records for milk yield, 550 for SCS, and 470 for days open). In each cross-validation round, 9 subsets were used as the training set to estimate fixed and random effects, whereas the remaining subset served as the testing set to evaluate the predictive performance of the models.
In addition, single-step GBLUP (ssGBLUP) analyses were performed for each trait using the BLUPF90 software package (Misztal et al., 2002) to estimate the variance components by REML to compare the genetic variance explained with the other methodologies. The analyses included SNPs retained after quality control, fixed effects and the animal effect as specified in model 2, and a pedigree file comprising 16,944 individuals with an average depth of at least 3 generations.
Cows were on average 24.29 ± 2.04 mo of age, produced on average 12,990 ± 2,044 kg of milk adjusted 305 d, SCS of 1.17 ± 1.02, and had on average 122.1 ± 68.4 d open. The use of genomic information yielded higher predictive performance than using only fixed effects, except for the model that included K_ROH_. There were no differences in predictive performance comparing single versus multikernel models. Kernel models fitting SNPs yielded the best predictive performance: correlations between predicted and observed phenotypes for milk yield of 0.57 (0.56–0.58, 95% CI), 0.63 for SCS (0.63–0.64, 95% CI), and 0.20 for days open (0.19–0.22, 95% CI). In the models including only fixed effects the predictive performance was 0.48 (0.47–0.49, 95% CI) for milk yield, 0.62 (0.62–0.63, 95% CI) for SCS, and 0.17 (0.16–0.19, 95% CI) for days open (Figure 1).Figure 1. Predictive performance (Pearson correlation) in testing models for each trait. XB = model with only fixed effects; K_1_ = model with fixed effects and K_SNP_; K_2_ = model with fixed effects and K_HAP_; K_3_ = model with fixed effects and K_ROH_; K_1_ + K_2_ = model with fixed effects plus K_SNP_ and K_HAP_; K_1_ + K_3_ = model with fixed effects plus K_SNP_; and K_HAP_; K_2_ + K_3_ = model with fixed effects plus K_HAP_ and K_ROH_; K_1_ + K_2_ + K_3_ = model with fixed effects plus K_SNP_, K_HAP_, and K_ROH_. Each box represents the distribution of Pearson correlation values for the evaluated models (Q1–Q3); the horizontal line indicates the median (Q2), the whiskers represent variability, and the dots denote outliers.
The results obtained with ssGBLUP-REML of explained genetic variance indicated a higher estimation for milk yield (877,240 ± 74,775) compared with model 2, which included SNPs (861,677 ± 71,207). In contrast, for SCS and days open, the explained genetic variance was lower with ssGBLUP-REML; 0.057 ± 0.01 and 310.49 ± 74 compare with 0.07 ± 0.01 and 432.5 ± 71, respectively (Table 1). In addition, the accuracy of the prediction for milk yield was 0.73 ± 0.03, 0.55 ± 0.04 for SCS, and 0.49 ± 0.04 for days open, higher values for milk production and open days compared with the results of model 2, although these are different methodologies. These findings suggest that methodologies are not trait-dependent, and the selection of a specific method should be guided by the available data and computational resources. Nonetheless, models that integrate multiple sources of genetic information tend to capture a greater proportion of genetic variance to reproductive and health traits.Table 1. Total genetic variance calculated by ssGBLUP and the kernel models, and the percentage explained by each source of genomic information1TraitModelTotal ± SE% SNP% HAP% ROHMilk yieldssGBLUP877,240 ± 74,775100K_SNP_861,677 ± 71,207100K_HAP_830,437 ± 70,873100K_ROH_1,209,524 ± 225,372100K_SNP_ + K_HAP_898,422 ± 93,39258.941.1K_SNP_ + K_ROH_1,226,886 ± 84,38869.730.03K_HAP_ + K_ROH_1,195,514 ± 86,58568.631.4K_SNP_ + K_HAP_ + K_ROH_1,206,846 ± 103,28544.328.427.3SCSssGBLUP0.057 ± 0.01100K_SNP_0.07 ± 0.01100K_HAP_0.07 ± 0.01100K_ROH_0.17 ± 0.04100K_SNP_ + K_HAP_0.10 ± 0.0151.448.6K_SNP_ + K_ROH_0.17 ± 0.0237.762.3K_HAP_ + K_ROH_0.17 ± 0.0237.063.0K_SNP_ + K_HAP_ + K_ROH_0.17 ± 0.0124.323.252.5Days openssGBLUP310.5 ± 74100K_SNP_432.5 ± 71100K_HAP_415.7 ± 67100K_ROH_967.8 ± 303100K_SNP_ + K_HAP_545.3 ± 5851.448.6K_SNP_ + K_ROH_1,035.9 ± 14536.863.2K_HAP_ + K_ROH_1,040.5 ± 14835.364.7K_SNP_ + K_HAP_ + K_ROH_1,061.4 ± 11024.022.353.71K_SNP_ = kernel with SNPs; K_HAP_ = kernel with HAP; K_ROH_ = kernel with ROH (in all the models including fixed effects). = genetic variance; % SNP = percentage of the genetic variance explained by kernel with SNPs; % HAP = percentage of the genetic variance explained by kernel with HAP; % ROH = percentage of the genetic variance explained by kernel with ROH; ssGBLUP = single-step GBLUP.
Note that the variation explained (R^2^) by model 1 including only fixed effects was 0.26 ± 0.001 for milk yield, 0.41 ± 0.001 for SCS, and 0.06 ± 0.001 for days open. Interestingly, the R^2^ increased with the inclusion of 2 or more kernels, especially with the inclusion of K_SNP_ and K_ROH_: 0.68 ± 0.001 for milk yield, 0.64 ± 0.001 for SCS, and 0.50 ± 0.002 for days open, demonstrating that models that include genomic information yield a better goodness-of-fit. In addition, we found similar fixed effect estimates in each model and trait.
Although the models that included only fixed effects and the K_ROH_ for the 3 traits did not show predictive correlations at the same level as the other models that included only K_SNP_ or K_HAP_ (Figure 1), the largest estimations of genetic variance were obtained with models including K_ROH_, which is probably because we computed a Gaussian kernel. We can see in Table 1 that the inclusion of multiple kernels did not show a considerable increase in genetic variance compared with those models only including one type of kernel, specifically K_ROH_. But in the models with the combination of kernels, different percentages of genetic variance explained were found for each one; in the models with the inclusion of K_SNP_ and K_HAP_ for the 3 traits, it was K_SNP_ that showed the highest percentage of genetic variance explained (58.9%, 51.4%, and 51.4% for milk yield, SCS, and days open, respectively); for milk yield in the rest of models, K_SNP_ had the highest percentage of genetic variance explained: 44.3% in the model with 3 kernels and 69.7% in the model combined with K_ROH_. In contrast, for the health and reproductive traits, it was K_ROH_ that showed the highest percentage of genetic variance explained; for SCS in all models where the K_ROH_ was included the genetic variance was 0.17, and it was this same kernel that explained the highest percentage of genetic variance in the combined models (52.5%–53.0%). For days open, the model that included the 3 kernels was the one that showed the highest values of genetic variance (1,061.4 ± 110) and the K_ROH_ was the one that had the highest percentage of genetic variance explained (53.7%).
The greater proportion of explained genetic variance observed with K_ROH_ is probably because ROH captures information on both recent and distant relationships among individuals through alleles identical by descent, without being restricted to the number of generations represented in a pedigree. Moreover, when used as markers, ROH have the advantage that multiple loci within them may be in LD with a QTL, whereas a single SNP can only reflect the LD between one locus and a QTL (Luan et al., 2014). In contrast, HAP, which tend to be inherited together because of their physical proximity, may include combinations of alleles that are not exclusively homozygous, as is the case of ROH. These alleles according to their combination (homozygous or heterozygous) could differentially influence the expression of a trait due to the LD between causal mutations and HAP alleles, or greater ability to capture short-range epistatic effects (Ferdosi et al., 2016; Hess et al., 2017).
In Table 2 we can observe that the heritability for the 3 traits was found to be similar with the models that included K_SNP_ or K_HAP_ independent or combined: 0.27 ± 0.02 to 0.29 ± 0.02 for milk yield, 0.11 ± 0.01 to 0.14 ± 0.01 for SCS, and 0.09 ± 0.01 to 0.12 ± 0.01 for days open. In contrast, in the models that included the K_ROH_ and the other kernels the heritability was 0.36 ± 0.06 to 0.39 ± 0.06 for milk yield, 0.25 ± 0.06 to 0.26 ± 0.04 for SCS, and 0.20 ± 0.06 to 0.22 ± 0.05 for days open.Table 2. Heritability estimated (mean ± SE) by the kernels in genomic models and by ssGBLUP for milk yield, SCS, and days openItemMilk yieldSCSDays openKernel1 K_SNP_0.28 ± 0.020.12 ± 0.010.10 ± 0.02 K_HAP_0.27 ± 0.020.11 ± 0.010.09 ± 0.01 K_ROH_0.36 ± 0.060.25 ± 0.060.20 ± 0.06 K_SNP_ + K_HAP_0.29 ± 0.020.14 ± 0.010.12 ± 0.01 K_SNP_ + K_ROH_0.39 ± 0.030.26 ± 0.040.22 ± 0.05 K_HAP_ + K_ROH_0.38 ± 0.030.26 ± 0.040.22 ± 0.05 K_SNP_ + K_HAP_ + K_ROH_0.38 ± 0.030.26 ± 0.040.22 ± 0.04ssGBLUP0.41 ± 0.020.10 ± 0.010.07 ± 0.011K_SNP_ = kernel with SNPs; K_HAP_ = kernel with HAP; K_ROH_ = kernel with ROH (in all the models including fixed effects); ssGBLUP = single-step GBLUP.
For days open, the use of kernels shows levels of heritably higher than reported by Durán-Alvarez et al. (2023), which was 0.05 ± 0.004 in similar population of Mexican Holstein cows but without the use of genomic information and in cows with >3 lactations. Montaldo et al. (2010) reported levels of heritability of 0.01 ± 0.02 for calving interval at first lactation in Holstein cattle in Mexico, in another population of Holstein cows from the same country the heritability was of 0.06 ± 0.11 including 510 cows and with a 179 tag SNP panel (Zamorano-Algandar et al., 2021). In the case of SCS the heritability of a similar Holstein cattle population from the same country was 0.10 ± 0.02 without the inclusion of genomic information (Montaldo et al., 2010); this value was similar to the one found in the present study with the inclusion of K_SNP_ or K_HAP_, and in another Holstein cattle population from China the heritability was 0.24 ± 0.01 with the inclusion of genotypes of 984 individuals and 87,598 SNPs (Lu et al., 2023), similar to those found here with the inclusion of K_ROH_. This shows that the use of kernels contributes to increasing the estimation of the genetic variance and the heritability of the complex traits. Some studies claim that the use of Gaussian kernels can capture complex interactions of the genome, including nonadditive effects that are important for accurate phenotypic predictions compared with linear kernel that is expected to capture genetic signals through genomic relationships under additive inheritance (De los Campos et al., 2010).
Interestingly, we found that single kernel models for productive, health, and reproductive traits have good predictive abilities. That could be useful for hard-to-measure traits such as days open. To the best of our knowledge, this is the first study to apply multiple kernels for phenotypic prediction based on the construction of relationship matrices using SNPs, HAP, and ROH. The main advantage of kernel models is that they accommodate multiple sources of information provided that the kernels can be constructed from each information set (Morota and Gianola, 2014). Overall, we did not observe differences in prediction between kernels computed using SNPs and genomic-derived metrics. Remarkably, we found higher genetic variance when including Gaussian kernels for ROH, which could be explained by the mix of both additive and nonadditive effects.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Ashja A.Zorc M.Dovc P.Genome-wide association study for milk somatic cell score in Holstein Friesian cows in Slovenia Animals (Basel)142024271310.3390/ani 1418271339335302 PMC 1142925139335302 · doi ↗ · pubmed ↗
- 2Baba T.Pegolo S.Mota L.F.M.Peñagaricano F.Bittante G.Cecchinato A.Morota G.Integrating genomic and infrared spectral data improves the prediction of milk protein composition in dairy cattle Genet. Sel. Evol.5320212910.1186/s 12711-021-00620-73372667233726672 PMC 7968271 · doi ↗ · pubmed ↗
- 3Bhati M.Kadri N.K.Crysnanto D.Pausch H.Assessing genomic diversity and signatures of selection in original Braunvieh cattle using whole-genome sequencing data BMC Genomics 2120202710.1186/s 12864-020-6446-y 3191493931914939 PMC 6950892 · doi ↗ · pubmed ↗
- 4Bian C.Prakapenka D.Tan C.Yang R.Zhu D.Guo X.Liu D.Cai G.Li Y.Liang Z.Wu Z.Da Y.Hu X.Haplotype genomic prediction of phenotypic values based on chromosome distance and gene boundaries using low-coverage sequencing in Duroc pigs Genet. Sel. Evol.5320217810.1186/s 12711-021-00661-y 3462009434620094 PMC 8496108 · doi ↗ · pubmed ↗
- 5Browning B.L.Zhou Y.Browning S.R.A one-penny imputed genome from next-generation reference panels Am. J. Hum. Genet.103201833834810.1016/j.ajhg.2018.07.0153010008530100085 PMC 6128308 · doi ↗ · pubmed ↗
- 6Calderón-Chagoya R.Vega-Murillo V.E.García-Ruiz A.Ríos-UtreraÁ.Martínez-Velázquez G.Montaño-Bermúdez M.Discovering genomic regions associated with reproductive traits and frame score in Mexican Simmental and Simbrah cattle using individual SNP and haplotype markers Genes (Basel)142023200410.3390/genes 1411200438002947 PMC 1067169538002947 · doi ↗ · pubmed ↗
- 7Chang C.C.Chow C.C.Tellier L.C.Vattikuti S.Purcell S.M.Lee J.J.Second-generation PLINK: Rising to the challenge of larger and richer datasets Gigascience 42015710.1186/s 13742-015-0047-82572285225722852 PMC 4342193 · doi ↗ · pubmed ↗
- 8Chen Z.Yao Y.Ma P.Wang Q.Pan Y.Haplotype-based genome-wide association study identifies loci and candidate genes for milk yield in Holsteins P Lo S One 132018 e 019269510.1371/journal.pone.019269529447209 PMC 581397429447209 · doi ↗ · pubmed ↗
