Polymorphisms of CYP7A1 and HADHB Genes and Their Effects on Milk Production Traits in Chinese Holstein Cows
Ao Chen, Qianyu Yang, Wen Ye, Lingna Xu, Yuzhan Wang, Dongxiao Sun, Bo Han

TL;DR
This study found genetic variations in two genes that affect milk production in Chinese Holstein cows, which could help improve dairy breeding.
Contribution
The study identifies specific SNPs in CYP7A1 and HADHB genes that are significantly associated with milk, fat, and protein yields in dairy cattle.
Findings
Eight SNPs in CYP7A1 and HADHB genes were significantly associated with 305-day milk, fat, and protein yields.
Three haplotype blocks formed by SNPs in these genes were also significantly linked to milk production traits.
Certain SNPs may alter gene expression by affecting transcription factor binding sites and mRNA stability.
Abstract
This study aimed to identify genetic variants related to differences in milk production in Chinese Holstein cows. By analyzing specific genes involved in lipid metabolism, researchers identified variants linked to the milk, fat, and protein yield cows produce over time. They also found patterns in these genetic variations that could affect how the genes work together. The results of this study suggest that these genetic markers could be useful for selecting cows with better milk-production traits. Additionally, the findings could help scientists better understand how these genes influence milk production, potentially leading to improved strategies for breeding dairy cattle. Overall, this research provides valuable insights into enhancing milk production in dairy farming, which could ultimately benefit both farmers and consumers. Our preliminary research proposed the cytochrome P450…
Figure 1
Funding
- STI 2030-Major Projects
- National Key R&D Program of China
- National Natural Science Foundation of China
- Beijing Digital Agriculture Innovation Consortium Project
- Program for Changjiang Scholar and Innovation Research Team in University
- Youth Elite Development Program of College of Animal Science and Technology, China Agricultural University
- China Agriculture Research System of MOF and MARA
Taxonomy
Topics: Cancer-related molecular mechanisms research · Genetic and phenotypic traits in livestock · RNA modifications and cancer
1. Introduction
Milk, often referred to as “white gold”, is a rich source of essential nutrients, including proteins, fats, carbohydrates, minerals, and vitamins, that support growth, bone-mass accrual, and immune function [1]. The surge in demand for high-quality dairy products is attributed to factors such as global population growth, rising income levels, and shifting dietary preferences [2]. This change in demand is particularly pronounced in developing regions, notably in Asia and Africa. Enhancing milk yield and quality traits is therefore a paramount objective of dairy breeding, and it requires effective breeding strategies such as genomic selection (GS).
GS facilitates early and precise selection of young bulls and reserve heifers without relying on phenotypic data, thereby shortening the dairy breeding cycle from 5–6 years to about 2 years. By significantly reducing generation intervals, this approach lowers breeding costs and accelerates genetic progress. Improving the accuracy of GS is therefore crucial for developing better breeding strategies [3]. Single-nucleotide polymorphisms (SNPs), which are prevalent in regulatory genomic regions such as promoters and enhancers, serve as primary genetic markers. They can affect gene expression by altering regulatory-protein binding or chromatin structure, and they can also affect protein structure, RNA splicing [4], and miRNA binding [5]. They are crucial in influencing pathways associated with milk synthesis and lactation [6,7,8,9]. Integrating functional SNPs into genomic-selection chips can enhance the efficiency, accuracy, and sustainability of selection [10], leading to improved milk-production traits in cows.
The rapid development of -omics technology has facilitated the process of functional-gene/locus mining. Our preliminary research proposed that the cytochrome P450 family 7 subfamily A member 1 (CYP7A1) and hydroxyacyl-coenzyme A dehydrogenase trifunctional multienzyme complex beta subunit (HADHB) genes are differentially expressed in different lactation stages in dairy cows and are involved in lipid metabolism [11]. The CYP7A1 gene encodes cholesterol 7 alpha-hydroxylase, the rate-limiting enzyme in the classic pathway of bile-acid synthesis in the liver [12]. Bile acids are integral to processes such as lipid digestion, absorption, and excretion, as well as to the regulation of cholesterol homeostasis and metabolic pathways. The HADHB gene encodes hydroxyacyl-CoA dehydrogenase, a vital component of the mitochondrial trifunctional protein (MTP) complex, which is responsible for catalyzing the final three steps of mitochondrial beta-oxidation of long-chain fatty acids [13]. Furthermore, the CYP7A1 gene is positioned within the quantitative trait loci (QTL) QTL_ID: 2732 and QTL_ID: 3408, which have been demonstrated to affect milkfat yield and percentage and milk protein percentage [14,15,16]. HADHB is situated within the QTL region related to milk yield (QTL_ID: 1512) and is approximately 11.8 kb from the SNP rs110711742, which is associated with milk yield [17]. These findings suggest a potential association between CYP7A1 and HADHB and milk-production traits.
This study investigates the genetic associations between candidate genes CYP7A1 and HADHB and milk-production traits in dairy cattle, including 305-day milk yield, fat yield, fat percentage, protein yield, and protein percentage. Additionally, predictive analyses were used to assess the regulatory effects of SNPs, particularly the effects on transcription factor binding sites (TFBSs) and mRNA stability, offering insights to support further exploration of causal mutations and their potential applications to developing GS chips for the breeding of dairy cattle.
2. Materials and Methods
2.1. Animal Selection and Phenotypic Data Collection
We selected 898 dairy cows from 45 Chinese Holstein sire families on 22 farms of Beijing Sunlon Livestock Development Co., Ltd. (Beijing, China) as the experimental population. The 45 bulls were used for SNP identification, and the 898 cows were used for association analysis (898 cows in the first lactation and 611 in the second lactation). Each sire family had an average of 21 daughters, and each cow had three generations of pedigree information and Dairy Herd Improvement (DHI) records provided by the Beijing Dairy Cattle Center (Beijing, China) (Table S1). The cows in each sire family were distributed across various dairy farms and were maintained under the same feeding conditions. The data consisted of the milk-production phenotypes of each cow for the whole lactation period of each parity: 305-day milk yield, fat yield, fat percentage, protein yield, and protein percentage. The study was conducted in accordance with the Guide for the Care and Use of Laboratory Animals and was approved by the Institutional Animal Care and Use Committee (IACUC) at China Agricultural University (Beijing, China).
2.2. Genomic DNA Extraction
The frozen semen of 45 bulls and blood samples from 898 cows were provided by Beijing Dairy Cattle Center (Beijing, China). We extracted genomic DNA from the semen using the salting-out procedure and used a TIANamp Blood DNA Kit (Tiangen, Beijing, China) to extract DNA from the blood samples. The quantity and quality of extracted DNA were determined by a NanoDrop2000 spectrophotometer (Thermo Science, Hudson, NH, USA) and gel electrophoresis (1.5%), respectively.
2.3. Identification and Genotyping of SNPs in Candidate Genes
According to the genomic sequences of the two genes, CYP7A1 (NC_037341.1) and HADHB (NC_037338.1) of Bos taurus in Genbank, primers were designed using Primer3.0 (https://bioinfo.ut.ee/primer3-0.4.0/ (accessed on 8 September 2021)). They were then synthesized by Beijing Liuhe Bgi Co., Ltd. (Beijing, China). The primers covered the entire coding region and 2000 base pairs upstream and downstream of the regulatory regions of the genes. The DNA samples from semen were diluted to 50 ng/µL and used for PCR amplification (Table S2). PCR products were analyzed by 2% gel electrophoresis, and the qualified products were sequenced bidirectionally by Beijing Qingke Xinye Biotechnology Co., Ltd. (Beijing, China). Then, the sequences were aligned to the reference sequence (ARS-UCD1.2) using NCBI-blast (https://blast.ncbi.nlm.nih.gov/Blast.cgi (accessed on 20 November 2023)) to identify SNPs. Subsequently, we genotyped the SNPs in the 898 cows using Genotyping by Target Sequencing (GBTS) technology by Boruidi Biotechnology Co., Ltd. (Shijiazhuang, China). The allelic and genotypic frequencies were calculated, and the Hardy−Weinberg equilibrium was tested by the chi-squared test.
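The frequency calculation and Hardy-Weinberg test described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `hwe_test` and the genotype counts in the usage line are hypothetical, and the test is assumed to be the usual one-degree-of-freedom chi-squared test for a biallelic SNP.

```python
import math

def hwe_test(n_aa, n_ab, n_bb):
    """Allele frequencies and a 1-df chi-squared HWE test from genotype counts.

    Assumes a biallelic SNP with both alleles present (expected counts > 0).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                        # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)), so no external library is needed.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, q, chi2, p_value

# Hypothetical genotype counts for 898 cows (not taken from the paper)
p, q, chi2, p_value = hwe_test(250, 440, 208)
```

A large p value here would indicate no detectable departure from Hardy-Weinberg proportions at that SNP.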
2.4. Linkage Disequilibrium (LD) Estimation
The extent of LD between the identified SNPs was estimated using Haploview 4.2 (Broad Institute of MIT and Harvard, Cambridge, MA, USA) with the solid spine algorithm. Haplotype blocks with a frequency greater than 0.05 were retained. The strength of LD was measured by D′: the larger the D′ value, the stronger the LD.
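For readers unfamiliar with D′, the sketch below computes Lewontin's normalized disequilibrium coefficient for two biallelic loci from haplotype and allele frequencies. It is an illustration of the statistic only, not of Haploview's internals; the function name `d_prime` is hypothetical, and the absolute value |D′| is returned, which is an assumption about the convention used.

```python
def d_prime(p_ab, p_a, p_b):
    """Lewontin's |D'| for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a, p_b: frequencies of allele A at locus 1 and allele B at locus 2
    """
    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:                            # a monomorphic locus carries no LD information
        return 0.0
    return abs(d) / d_max

# Hypothetical frequencies: only three of the four possible haplotypes occur,
# so D' reaches its maximum of 1 (complete LD).
complete_ld = d_prime(p_ab=0.46, p_a=0.66, p_b=0.46)
```

When one of the four possible two-locus haplotypes is absent, as in the usage line, D′ equals 1 even if the allele frequencies at the two loci differ.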
2.5. Association Analysis between SNP/Haplotype and Milk-Production Traits
The MIXED procedure in SAS 9.4 software (SAS Institute Inc., Cary, NC, USA) was used to perform association analysis between each SNP or haplotype block and the five milk-production traits (305-day milk yield, fat yield, fat percentage, protein yield, and protein percentage) using the following animal model:
$$y = \mu + HYS + b \times M + G + a + e$$
where y is the phenotypic value of each trait of each cow; μ is the overall mean; HYS is the fixed effect of farm (1–22: 22 farms), calving year (1–4: 2012–2015), and calving season (1: April–May; 2: June–August; 3: September–November; and 4: December–March); M is the age at calving as a covariate, and b is the regression coefficient of covariate M; G is the genotype or haplotype combination effect; a is the individual random additive genetic effect with a distribution of $N(0, A\sigma_a^2)$, where A is the pedigree-based relationship matrix and $\sigma_a^2$ is the additive genetic variance; and e is the random residual with a distribution of $N(0, I\sigma_e^2)$, where I is the identity matrix and $\sigma_e^2$ is the residual variance. Bonferroni correction was applied for multiple testing: the significance threshold was the nominal significance level divided by the number of genotype or haplotype combinations.
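The Bonferroni adjustment described above amounts to a single division. The sketch below assumes a nominal significance level of 0.05, which the text does not state, and a hypothetical helper name `bonferroni_threshold`.

```python
def bonferroni_threshold(alpha, n_combinations):
    """Per-comparison significance threshold under Bonferroni correction."""
    if n_combinations < 1:
        raise ValueError("n_combinations must be at least 1")
    return alpha / n_combinations

# Assumed nominal level of 0.05 (not given in the text) and a SNP with
# three genotype classes, e.g. AA, AB, and BB.
threshold = bonferroni_threshold(0.05, 3)
```

Under these assumptions, a comparison between genotypes would be declared significant only if its p value fell below roughly 0.0167.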
Furthermore, the additive, dominance, and allelic substitution effects of the SNPs were calculated by the following formulas:
$$a = \frac{AA - BB}{2}, \qquad d = AB - \frac{AA + BB}{2}, \qquad \alpha = a + d\,(q - p)$$
where a is the additive effect; d is the dominance effect; α is the allelic substitution effect; AA, AB, and BB are the least-squares means of the milk-production traits for the corresponding genotypes; p is the frequency of allele A; and q is the frequency of allele B.
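A minimal sketch of these three effects, assuming the standard quantitative-genetics definitions a = (AA − BB)/2, d = AB − (AA + BB)/2, and α = a + d(q − p), where AA, AB, and BB are the genotype least-squares means. The function name `snp_effects` and the numbers in the usage line are hypothetical.

```python
def snp_effects(mean_aa, mean_ab, mean_bb, p):
    """Additive (a), dominance (d), and allelic substitution (alpha) effects.

    mean_aa, mean_ab, mean_bb: least-squares means of the trait per genotype
    p: frequency of allele A (q = 1 - p is the frequency of allele B)
    """
    q = 1.0 - p
    a = (mean_aa - mean_bb) / 2.0               # half the difference between homozygotes
    d = mean_ab - (mean_aa + mean_bb) / 2.0     # heterozygote deviation from the midpoint
    alpha = a + d * (q - p)                     # average effect of substituting one allele
    return a, d, alpha

# Hypothetical least-squares means (kg of milk) and allele frequency
a, d, alpha = snp_effects(mean_aa=10200.0, mean_ab=10100.0, mean_bb=9900.0, p=0.6)
```

With purely additive gene action the heterozygote mean sits at the midpoint of the two homozygotes, so d is 0 and α equals a regardless of allele frequency.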
2.6. Prediction of Transcription Factor Binding Sites
We used JASPAR (http://jaspar.genereg.net/ (version 2023)) to predict whether SNPs in the 5′ flanking regions of the CYP7A1 and HADHB genes changed transcription factor binding sites (TFBSs) (relative score ≥ 0.80).
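To make the relative-score cutoff concrete, the sketch below scores a reference and an alternative sequence against a position weight matrix and reports whether a site is gained or lost at a threshold of 0.80. It assumes the relative score is the raw matrix score rescaled between the matrix's minimum and maximum possible scores. The matrix `PWM`, the sequences, and the function names are all hypothetical and do not correspond to any JASPAR profile.

```python
# Hypothetical 4-position log-odds matrix, one dict per position (not a JASPAR profile)
PWM = [
    {"A": 2.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 2.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 2.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 2.0},
]

def relative_score(site, pwm):
    """Raw PWM score rescaled to [0, 1] between the worst and best possible site."""
    if len(site) != len(pwm):
        raise ValueError("site length must match the matrix width")
    raw = sum(column[base] for column, base in zip(pwm, site))
    lowest = sum(min(column.values()) for column in pwm)
    highest = sum(max(column.values()) for column in pwm)
    return (raw - lowest) / (highest - lowest)

def binding_site_change(ref_site, alt_site, pwm, threshold=0.80):
    """Classify the effect of a variant on a predicted binding site."""
    ref_hit = relative_score(ref_site, pwm) >= threshold
    alt_hit = relative_score(alt_site, pwm) >= threshold
    if ref_hit and not alt_hit:
        return "lost"
    if alt_hit and not ref_hit:
        return "gained"
    return "unchanged"

# Hypothetical alleles differing at the last position of the site
effect = binding_site_change("ACGT", "ACGA", PWM)
```

In practice each allele would be scanned with every profile and at every window overlapping the SNP; the sketch covers a single window and a single profile.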
2.7. Prediction of mRNA Structure
We used the NCBI database (https://blast.ncbi.nlm.nih.gov/ (accessed on 20 November 2023)) to check whether mutations in exons were synonymous or missense. Subsequently, we used RNAfold WebServer (http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi (accessed on 20 November 2023)) to predict the changes to the RNA secondary structure caused by the mutation in the untranslated region, with the parameters of minimum free energy (MFE), partition function, and avoidance of isolated base pairs. MFE was used for a direct comparison of the folding stability of RNAs of the same size; the lower (more negative) the MFE, the greater the stability.
3. Results
3.1. SNPs Identification
We identified five SNPs in the CYP7A1 gene and three in the HADHB gene in 898 dairy cows from 45 Chinese Holstein sire families on 22 farms. In CYP7A1, 14:g.24676921A>G (rs42765357), 14:g.24676224G>A (rs109454495), and 14:g.24675708G>T (rs42765359) were located in the 5′ flanking region; 14:g.24665961C>T (rs108958186) was found in the 3′ untranslated region (UTR); and 14:g.24664026A>G (rs109680813) was found in the 3′ regulatory region. In HADHB, 11:g.73256269T>C (rs110033443) was located in the 5′ flanking region, 11:g.73256227A>C (rs134856746) in the 5′ UTR, and 11:g.73242290C>T (rs137211407) in the intron between exons 5 and 6. The genotypic and allelic frequencies of all the identified SNPs are summarized in Table 1.
3.2. Single-Marker Association Analysis
The results of the association analysis (Table 2) showed that 14:g.24665961C>T, in CYP7A1, was significantly associated with milk yield (p value = 0.0443) and fat yield (p value ≤ 0.0001) in the first lactation period; four SNPs, 14:g.24676921A>G, 14:g.24676224G>A, 14:g.24675708G>T, and 14:g.24665961C>T, were significantly associated with milk, fat and protein yields (p value ≤ 0.0059) and 14:g.24664026A>G was highly significantly associated with fat yield (p value = 0.0037) in the second lactation.
In HADHB, all three SNPs, 11:g.73256269T>C, 11:g.73256227A>C, and 11:g.73242290C>T, were significantly associated with milk and protein yields in the first lactation (p value ≤ 0.0461) and with milk, fat, and protein yields in the second lactation (p value ≤ 0.001). The additive, dominance, and allelic substitution effects are shown in Table 3.
3.3. Haplotype Association Analysis
Using Haploview 4.2, five SNPs in CYP7A1, 14:g.24675708G>T, 14:g.24676224G>A, 14:g.24676921A>G, 14:g.24665961C>T and 14:g.24664026A>G, formed two haplotype blocks (Figure 1). In Block 1, the frequencies of haplotypes H1 (AC), H2 (GT) and H3 (AT) were 0.463, 0.342, and 0.194, respectively. Block 2 was composed of three haplotypes, H1 (TGA), H2 (GAG), and H3 (TGG), with frequencies of 0.349, 0.545, and 0.105, respectively. We found that the two blocks in CYP7A1 were significantly associated with milk, fat, and protein yields in both lactations (p value ≤ 0.0011; Table 4).
Similarly, two SNPs, 11:g.73256269T>C and 11:g.73256227A>C, in HADHB formed Block 3 (Figure 1). The frequencies of haplotypes H1 (AT), H2 (CC), and H3 (CT) were 0.159, 0.806, and 0.035, respectively. This block was found to be associated with milk, fat, and protein yields in the first lactation (p value ≤ 0.0315), and milk, fat, and protein yields in the second lactation (p value ≤ 0.0052; Table 4).
3.4. Changes of Transcription Factor Binding Sites Caused by SNPs in 5′ Region
Using JASPAR, changes in TFBSs were predicted for four SNPs in the 5′ regulatory regions of the CYP7A1 and HADHB genes. In CYP7A1, allele G of 14:g.24676921A>G created a binding site (BS) for the transcription factors (TFs) EBF1 and SP1, while the G-to-T mutation at 14:g.24675708G>T eliminated the BS for ELK1 but created BSs for FOXC1 and ZNF354C. In HADHB, the T-to-C mutation at 11:g.73256269T>C abolished the BSs for PAX2 and NR2F1 and created one for TFAP2A; in addition, allele A of 11:g.73256227A>C created a BS for ZNF354C, and allele C created a BS for HINFP (Table 5).
3.5. mRNA Structural Stability Altered by the Mutation in Untranslated Region
Using RNAfold, changes in the mRNA minimum free energy (MFE) were predicted. The mutation of C to T in SNP 14:g.24665961C>T causes the MFE of mRNA to change from −578.10 kcal/mol to −539.20 kcal/mol, resulting in a decrease in the structural stability of CYP7A1 mRNA.
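The size of this shift can be checked with simple arithmetic. The sketch below uses the two MFE values reported above; the helper name `mfe_shift` is hypothetical, and expressing the change as a fraction of the reference MFE is an illustrative choice, not something the analysis itself reports.

```python
def mfe_shift(mfe_ref, mfe_alt):
    """Change in minimum free energy (kcal/mol) and its size relative to the reference."""
    delta = mfe_alt - mfe_ref            # positive delta means a less stable predicted fold
    fraction = delta / abs(mfe_ref)
    return delta, fraction

# MFE values reported for the C and T alleles of 14:g.24665961C>T
delta, fraction = mfe_shift(mfe_ref=-578.10, mfe_alt=-539.20)
# delta is about +38.90 kcal/mol, i.e. roughly 6.7% of the reference MFE
```

A positive difference means the T allele is predicted to fold into a less stable structure than the C allele.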
4. Discussion
The CYP7A1 and HADHB genes have been implicated in milk-production traits in dairy cows, as indicated by recent research studies. CYP7A1 is known to influence triglyceride and cholesterol metabolism in the liver tissue of dairy cows and is a rate-limiting enzyme for bile-acid synthesis [18]; in chickens, HADHB plays a crucial role in liver lipid metabolism, particularly in inducing peroxisomal and mitochondrial β-oxidation activity [19]. These findings affirm the possible beneficial influence of these genes on milk production. Here, further identification of SNPs in the CYP7A1 and HADHB genes and analysis of their association with milk-production traits in dairy cows provide valuable information for understanding the mechanism of inheritance of these traits and linking the metabolic functions of these genes to their phenotypic effects. Furthermore, the integration of significant genetic sites from the CYP7A1 and HADHB genes into GS models can facilitate more precise selection of individuals with favorable milk-production traits. The purpose of these models is to predict the genetic advantage of the trait of interest based on the SNP profile of the individual, such that breeders can screen individuals for specific genetic variants associated with particular milk-production phenotypes, select them for breeding, and thus improve the efficiency of breeding.
In this study, we identified SNPs in the CYP7A1 and HADHB genes and confirmed their significant associations with milk-production traits in dairy cows. SNPs can cause phenotypic variation by affecting the function or expression of genes involved in various biological processes [20,21,22,23]. We found significant differences in milk-production phenotypes among cows with different genotypes at the SNP sites in the CYP7A1 and HADHB genes. The number of cows with lower phenotypic values was also smaller, possibly because genotypes associated with higher production had been selected for, and those associated with lower production had been eliminated, during long-term artificial breeding [24]. In addition, the phenotypic values of the cows and the significance of the SNP and haplotype-block associations were lower in the first lactation than in the second lactation. Two factors could explain this difference. First, the different numbers of cows in the two lactations may affect the statistical significance of the corresponding results. Second, there are physiological differences between the two lactations, as dairy cows tend to produce more milk in their second lactation [25].
Transcription factors are proteins that bind to specific DNA sequences and modulate the expression of target genes involved in various biological processes [26]. In this study, we found changes in TFBSs caused by SNPs in the 5′ flanking regions of the CYP7A1 and HADHB genes. These changes may alter the regulation of gene expression by TFs, affecting the participation of the corresponding expression products in normal physiological and metabolic activities and, in turn, contributing to changes in milk production. For instance, in HADHB, TFAP2A is a specific nuclear transcription factor that may promote cell proliferation by positively regulating certain cell-proliferation-associated proteins such as EIA, SV40, and c-myc [27]. Binding of this TF at 11:g.73256269C may promote expression of the HADHB gene, which could explain the significantly higher milk, fat, and protein yields in cows with genotype CC than in those with genotype TT in the second lactation. Conversely, ZNF354C forms a number of transcriptional-repression complexes that inhibit the transcription of target genes [28]. Binding of this TF to the HADHB gene at 11:g.73256227A may repress its expression, which may have contributed to the significantly lower yields of milk, fat, and protein in individuals with genotype AA than in those with genotype CC in the second lactation.
The untranslated region (UTR) is the non-coding segment at either end of the mRNA molecule; although it lies within exons, it is not translated into protein. Consequently, SNPs within the UTR do not directly alter the encoded protein sequence and are typically regarded as neutral variants. However, such SNPs may change the mRNA’s minimum free energy and thereby reduce mRNA stability. The SNP 14:g.24665961C>T, located in the 3′ UTR of the CYP7A1 gene, is predicted to reduce mRNA stability, which could reduce the efficiency with which this gene is expressed and possibly affect triglyceride and cholesterol metabolism [29,30]. This may explain why milk fat yield in the second lactation was significantly higher in individuals with genotype CC than in those with genotype TT at 14:g.24665961C>T.
5. Conclusions
This study identified polymorphisms in the CYP7A1 and HADHB genes and their significant genetic effects on milk-production traits in Chinese Holstein cows, offering candidate genetic markers for GS of dairy cattle. Four SNPs that altered TFBSs, 14:g.24676921A>G, 14:g.24675708G>T, 11:g.73256269T>C, and 11:g.73256227A>C, were proposed as possible causative mutations for milk-production traits. Further in-depth verification is needed before these SNPs can be used to provide genetic information to support the breeding of dairy cattle.
References
- 1. Pereira P.C. Milk nutritional composition and its role in human health. Nutrition 2014, 30, 619–627. doi:10.1016/j.nut.2013.10.011. PMID: 24800664.
- 2. Khaw K.-T., Friesen M.D., Riboli E., Luben R., Wareham N. Plasma phospholipid fatty acid concentration and incident coronary heart disease in men and women: The EPIC-Norfolk prospective study. PLoS Med. 2012, 9, e1001255. doi:10.1371/journal.pmed.1001255. PMID: 22802735. PMCID: PMC3389034.
- 3. Vignal A., Milan D., San Cristobal M., Eggen A. A review on SNP and other types of molecular markers and their use in animal genetics. Genet. Sel. Evol. 2002, 34, 275–305. doi:10.1186/1297-9686-34-3-275. PMID: 12081799. PMCID: PMC2705447.
- 4. Lalonde E., Ha K.C., Wang Z., Bemmo A., Kleinman C.L., Kwan T., Pastinen T., Majewski J. RNA sequencing reveals the role of splicing polymorphisms in regulating human gene expression. Genome Res. 2011, 21, 545–554. doi:10.1101/gr.111211.110. PMID: 21173033. PMCID: PMC3065702.
- 5. Gong J., Tong Y., Zhang H.M., Wang K., Hu T., Shan G., Sun J., Guo A.Y. Genome-wide identification of SNPs in microRNA genes and the SNP effects on microRNA target binding and biogenesis. Hum. Mutat. 2012, 33, 254–263. doi:10.1002/humu.21641. PMID: 22045659.
- 6. Raven L.-A., Cocks B.G., Pryce J.E., Cottrell J.J., Hayes B.J. Genes of the RNASE5 pathway contain SNP associated with milk production traits in dairy cattle. Genet. Sel. Evol. 2013, 45, 25. doi:10.1186/1297-9686-45-25. PMID: 23865486. PMCID: PMC3733968.
- 7. Fang M., Fu W., Jiang D., Zhang Q., Sun D., Ding X., Liu J. A multiple-SNP approach for genome-wide association study of milk production traits in Chinese Holstein cattle. PLoS ONE 2014, 9, e99544. doi:10.1371/journal.pone.0099544. PMID: 25148050. PMCID: PMC4141689.
- 8. Ibeagha-Awemu E.M., Peters S.O., Akwanji K.A., Imumorin I.G., Zhao X. High density genome wide genotyping-by-sequencing and association identifies common and low frequency SNPs, and novel candidate genes influencing cow milk traits. Sci. Rep. 2016, 6, 31109. doi:10.1038/srep31109. PMID: 27506634. PMCID: PMC4979022.
