A hypomorphic SRD5A2 haplotype with a potential founder effect: composed of common variants in individuals with 5α-reductase type 2 deficiency from South China
Xiaoyun Lei, Xu Zhou, Zifeng Cheng, Sen Zhao, Chunrong Gui, Yunting Ma, Meizhen Shi, Xianda Wei, Bobo Xie, Xin Fan, Shaoke Chen, Qiuxing Tao, Yuna Su, Dejian Yuan, Baoheng Gui

TL;DR
A new SRD5A2 gene variant combination is linked to a common sex development disorder in southern China, suggesting a genetic founder effect and altered enzyme function.
Contribution
Identified a hypomorphic SRD5A2 haplotype (Hap3: G-A) with a potential founder effect in southern China, showing combined structural and functional impacts in 5α-RD2 deficiency.
Findings
Hap3: G-A is highly prevalent in 5α-RD2 deficiency cases in southern China and is associated with microphallus.
Structural changes from Hap3: G-A include increased solvent-accessible surface area and disrupted hydrogen bonds.
Kinetic assays show Hap3: G-A has intermediate catalytic efficiency, indicating a hypomorphic effect.
Abstract
Disorders of sex development (DSDs) exhibit high genetic and phenotypic heterogeneity, and genotype–phenotype correlations are not fully understood. 5α-Reductase type 2 (5α-RD2) deficiency, a common form of DSD, is caused by SRD5A2 inactivation. This study investigated the role of SRD5A2 haplotypes in DSD, focusing on their corresponding phenotypes, structural changes and impacts on enzyme activity. This study enrolled 216 individuals with DSD who underwent genetic analysis and 2,794 controls. Linkage disequilibrium analysis was performed in individuals with 5α-RD2 deficiency to identify SRD5A2 haplotypes, and haplotype frequencies were analysed across cohorts. The clinical manifestations of individuals with different SRD5A2 haplotypes were characterized. Structural predictions were employed to investigate the impacts of haplotypes on the 5α-RD2 structure and interactions with ligands.…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6- —the Joint Project on Regional High-Incidence Diseases Research of Guangxi Natural Science Foundation
- —the First-class Discipline Innovation-driven Talent Program of Guangxi Medical University
- —the Guangxi Green Seeding Talent Inclusive Support Policy
- —the Candidates for the First Cohort of the Medical Youth Reserve Talent Cultivation Program in Guangxi Zhuang Autonomous Region
- —National Natural Science Foundation of China
- —the Guangxi Bagui Program for Outstanding Young Talents
- —the special Scientific Research Fund of Guangxi Ten-Hundred-Thousand Talents Project
- —the Initial Scientific Research Fund for Advanced Talents from the Second Affiliated Hospital of Guangxi Medical University
- —the National High Level Hospital Clinical Research Funding
- —the Guangxi Medical University Training Program for Distinguished Young Scholars
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSexual Differentiation and Disorders · Prostate Cancer Treatment and Research · Urinary Bladder and Prostate Research
Introduction
Rare congenital conditions characterized by incongruence among chromosomal, gonadal, and phenotypic sex characteristics are medically classified as differences (or disorders) of sex development (DSDs) [1–3]. According to the consensus, DSDs are classified into three groups: 46, XY DSD; 46, XX DSD; and sex chromosomal DSD [4]. Among these, 46, XY DSD is considered the most complex type, affecting approximately 1 in 6,000 live births [4, 5]. The genotype–phenotype correlation among DSD individuals has still not been fully elucidated, even for those carrying the same variant [6]. 5α-Reductase type 2 (5α-RD2) deficiency (OMIM: #264600), a rare autosomal recessive disorder, a relatively common type of 46, XY DSD, is caused by a lack or deficiency of 5α-RD2 [7]. Since the first reported 5α-RD2 deficiency individuals [8], an increasing number of cases with many types of features ranging from nearly female external genitalia to undervirilized male genitalia, including penile hypospadias and isolated microphallus have been described [9]. Even siblings from the same family who carry the same variant present different gender identities [10].
5α-RD2 plays a crucial role in catalyzing the conversion of testosterone to the more potent dihydrotestosterone (DHT) via the use of NADPH as a hydrogen donor during the hormone biosynthetic process [8, 11]. 5α-RD2 is highly expressed in androgen-sensitive tissues, including the prostate, external genitalia, epididymis, and foreskin, during fetal development [12]. Thus, 5α-RD2 is important for regulating the differentiation and formation of male external genitalia by maintaining an appropriate DHT level. During embryonic development, in the absence of DHT stimulation, 46, XY DSD can present with a wide spectrum of phenotypes, ranging from female-typical external genitalia to clitoromegaly or microphallus, anatomic anomalies such as cloacal exstrophy, hypospadias, and cryptorchidism [13–15].
5α-RD2 is encoded by the SRD5A2 gene (OMIM: #607306). Different variants of SRD5A2 result in variable degrees of enzyme activity and are found in patients with diverse clinical manifestations. For example, the frameshift variant c.655delT led to a complete loss of enzymatic activity, and this variant was correlated with penoscrotal hypospadias, a clitoris-like phallus [16]. Other types of variants, such as missense variants, result in a partial reduction in 5α-RD2 enzymatic activity. For example, the variant SRD5A2 c.680G > A p.R227Q can largely reduce enzyme activity to 3.2% and is found in individuals with clitoromegaly or clitoris penis as well as hypospadias [17], which are considered pathogenic or likely pathogenic [18]. However, other missense variants, such as SRD5A2 c.265C > G p.L89V, reportedly mildly decrease the activity of the enzyme to 80–92% [17, 19–21], and c.265C > G is common in healthy individuals according to the Genome Aggregation Database (gnomAD) [22]. This variant is often considered benign [17, 18]. The pathogenesis of SRD5A2 missense variants remains controversial, primarily because the correlation between the threshold of enzyme activity reduction and phenotypes remains inconclusive [23], highlighting the complex association between genotype and phenotype and suggesting that other factors possibly play a role in DSD pathogenesis.
Our previous multicenter study revealed that the c.680G > A variant is a hotspot site in the Chinese population, especially in South China, indicating a potential founder gene effect of SRD5A2 [24]. Interestingly, we found that some individuals with 5α-RD2 deficiency harboring the c.680G > A variant also carried c.265C > G [24]. This suggested that such variants might be in linkage and compose a haplotype, further implying a possible founder effect of SRD5A2. Moreover, it remains unclear whether the effect of a haplotype is equivalent to the accumulated or overlapping effects of two or more linked variants. To the best of our knowledge, few studies have examined the founder effect of SRD5A2 from the perspective of haplotype analysis, especially analysis based on c.680G > A [10, 25]. Some haplotypes are hypomorphic, which results in partial loss of gene function but not complete null; for example, a common hypomorphic haplotype of TBX6 is found in vertebral malformation, defined by the nonreference alleles of three common single-nucleotide variants (SNVs), which cause only a moderate reduction in protein function [26]. Therefore, exploring hypomorphic haplotypes may provide novel insights into the intricate genotype–phenotype relationships for diseases with high heterogeneity in genetic factors and clinical manifestations.
In this study, linkage disequilibrium (LD) analysis focusing on SRD5A2 variants was performed, and haplotypes composed of c.265C > G and c.680G > A were identified. The frequency of the specific haplotype was calculated in individuals with 5α-RD2 deficiency or non-5α-RD2 deficiency DSD and further compared with controls from in-house and public cohorts. The clinical manifestations of individuals with different haplotypes were analysed. Additionally, to elucidate the molecular mechanisms of the haplotype in 5α-RD2 deficiency, we sought to predict the structural effects of the haplotype on enzyme stability and binding with NADPH and testosterone, and further functional assays were performed to validate the influence on enzyme activity.
Methods
Participant recruitment and phenotype evaluation
Participants were recruited from the genetic and endocrine metabolism specialist outpatient departments of pediatrics at the Second Affiliated Hospital of Guangxi Medical University and Liuzhou Hospital of Guangzhou Women and Children’s Medical Center from 2015 to 2024. They were referred to pediatric endocrinologists for clinical evaluation according to the criteria of the European Society for Pediatric Endocrinology and Lawson Wilkins Pediatric Endocrine Society, followed by molecular genetic analysis at their respective hospitals. Individuals assigned male at birth who exhibited external genital features indicative of incomplete virilization, including hypospadias, microphallus, and cryptorchidism, were enrolled in the study. Additionally, phenotypically female individuals with a 46,XY karyotype were also included (Supplementary Tables S1 and S3). The study controls contained two subsets: the in-house controls consisted of 290 unrelated individuals without apparent genital malformations (Supplementary Table S5), and the public controls comprised 2,504 unrelated healthy individuals from the 1000 Genomes Project (Supplementary Table S7) [27]. The enrolled patients and in-house controls were mainly from South China. The project was approved by the Institutional Medical Ethics Review Board of the Second Affiliated Hospital of Guangxi Medical University (2021-KY-0014). Informed consent was obtained from the parents of the participants.
Genetic analysis by DSD-panel or whole-exome sequencing
The genomic DNA extracted from the peripheral blood was used for library preparation, where different capture kits were utilized depending on the target regions. For a subset of patients in the 5α-RD2 deficiency cohort, molecular genetic testing was based on next-generation sequencing (NGS) focused on a comprehensive panel containing DSD-related genes (Supplementary Table S8). For this DSD-related gene panel, the SureSelect Custom Library Kit (Agilent, Santa Clara, CA, USA) was used. For the remaining individuals with DSD, including the majority of individuals with 5α-RD2 deficiency et al., and other in-house controls, whole-exome sequencing (WES) was performed. For WES, the SureSelect Human All Exon V6 Library Kit (Agilent) or IDT xGen Exome Research Panel V2 probes were used. The prepared libraries were sequenced at an average depth of more than 100 × on high-throughput NGS platforms, such as the Illumina NovaSeq 6000 series, according to the manufacturer’s instructions.
Sequencing data analysis and haplotype calling
The raw sequencing reads were subjected to quality control via NGSQCToolkit, and poor-quality reads were removed, followed by alignment to the reference human genome sequence via BWA software [28, 29]. After excluding the duplicated reads and performing statistical analysis on the remaining reads, variants were called via GATK software [30]. The called variants were annotated based on public databases for variant records and population frequencies, such as gnomAD [22]. Candidate variants were further confirmed by Sanger sequencing.
To identify SNVs in LD with c.680G > A, only variants located in the SRD5A2 region and 2 Mb upstream or downstream were included. For the probands in the 5α-RD2 deficiency cohort, pairwise LD measures (D’ and R^2^), i.e., the estimated haplotype frequencies and those expected under linkage equilibrium, were statistically inferred via the expectation maximization (EM) algorithm using PLINK software (https://zzz.bwh.harvard.edu/plink/) [31]. The c.680G > A and its flanking linked SNVs were used to construct different haplotypes. For individuals homozygous for specific loci, haplotypes can be directly determined. For those remaining heterozygous for these loci, haplotypes can be assigned by pedigree analysis if their family members’ genotypes are available and informative. To evaluate the population differentiation of the assigned haplotype, FST values across populations were calculated via VCFtools version 0.1.17.
Three-dimensional (3D) structure prediction and model construction for 5α-RD2 with NADPH and testosterone
To investigate the effects of SRD5A2 haplotypes on the enzyme itself and its interactions with the cofactor NADPH and the substrate testosterone, a 5α-RD2 structure model was built based on the resolved crystal structure (Protein Data Bank ID: 7BW1) [7]. Since the 7BW1 structure contains unresolved residues (N-terminus 1–4 and loop L1 residues 39–43) and lacks the corresponding ligand model, the full-length structure was predicted and completed via Chai-1 [32]. Restrained optimization was then performed to construct four ternary models containing NADPH and testosterone, each in combination with 5α-RD2 encoded by wild-type (WT) or mutant haplotypes. During construction, residue 227 of 5α-RD2 was restrained within 5 Å of NADPH (according to the ligand interaction sites in the 7BW1 structure), and residue 114 of 5α-RD2 was restrained within 5 Å of testosterone to compensate for the missing parts and to ensure that the structure was near the active site. To analyse the solvent-accessible surface area (SASA), interaction networks, and key interacting residues of the models, PyMOL software was used [33]. The binding modes and active site changes caused by haplotypes were compared to provide a structural basis, which is helpful in predicting enzyme catalytic activity under WT and mutant haplotypes.
Cell culture and SRD5A2 plasmids transfection
HEK293T cells were cultured in 293 T cell-specific medium (CM-0005, Procell, Wuhan, China) at 37 °C with 5% CO_2_. The WT SRD5A2 plasmid was constructed through linearized vectors obtained by digestion with restriction endonucleases. The resulting PCR amplimers were digested with XbaI and BsrGI and then ligated into the plasmid 3Flag-EGFP (supplied by Shanghai Genechem Co., Ltd.). The resulting plasmids were identified via Sanger sequencing (forward primer: 5′-CGCAAATGGGCGGTAGGCGTG-3′; reverse primer: 5′-TTATTAGGAAAGGACAGTGGG-3′). The plasmid was subsequently transformed into E. coli and further validated via Sanger sequencing. The cells were transfected with Lipofectamine 2000 Transfection Reagent (11668–019; Invitrogen, California, USA) according to the manufacturer’s instructions. The cells were collected after 48 h of transfection, and western blotting (WB) was performed to validate 5α-RD2 overexpression via primary antibodies (TA-05 and TA-08, ZSGB-BIO, Beijing, China) and secondary antibodies (ZB-2305, ZSGB-BIO, Beijing, China).
Kinetic assays of 5α-RD2 by liquid chromatography-tandem mass spectrometry (LC–MS/MS)
HEK293T cells transfected with control vector GFP, SRD5A2 WT or haplotypes were cultured with media supplemented with 10 μM testosterone and various concentrations of NADPH (0.0625–2 μM) (CAS: 58–22-0, N8100; Solarbio, Beijing, China) or 500 μM NADPH and various concentrations of testosterone (0.25–8.0 μM) (CAS: 2646–71-1, IT0110; Solarbio, Beijing, China). After 4 h, the medium was collected, and steroid hormones were extracted with ethyl acetate. Analytes were subjected to protein precipitation combined with liquid extraction for LC–MS/MS machine (ACQUITY UPLC I-Class IVD/Xevo TQ-S IVD System, Waters Corporation, Milford, USA) analysis, with mobile phase A containing ammonium fluoride and phase B being alcohol methanol, which was quantified according to the peak steroid hormone area. Protein concentrations were determined via BCA assays (P0012S, Beyotime, Shanghai, China). The rate of DHT production (pmol/mg protein/h) was calculated. The enzyme kinetics was analysed via nonlinear regression (curve fit) based on the Michaelis–Menten model in GraphPad Prism version 10.1.2. This analysis generated the maximum velocity (Vmax) values of the enzymes encoded by the specific haplotype.
Statistical analysis
Statistical analysis was performed via SPSS version 27 (IBM). Two-sided Fisher’s exact test was used to evaluate differences in haplotype frequencies among cohorts, including individuals with 5α-RD2 deficiency, individuals with non-5α-RD2 deficiency DSD, in-house and public controls. For all subsequent pairwise comparisons, the Bonferroni correction was applied. Statistical significance was defined as p < 0.05. GraphPad Prism version 10.1.2 was used for visualization.
Results
Enrollment of the case and control cohorts
In total, 39 individuals with 5α-RD2 deficiency were enrolled. Among them, 26 cases underwent WES, while the remaining cases underwent panel sequencing targeting DSD-related genes (Supplementary Table S1). A total of 177 individuals with non-5α-RD2 deficiency DSD were subsequently recruited (Supplementary Table S3). For comparison, we collected 290 in-house controls without DSD-related phenotypes, primarily from South China (Supplementary Table S5), as well as public controls of 2,504 individuals who claimed themselves to be healthy from diverse ethnicities and geographical regions, including southern Han Chinese and Dai individuals in Xishuangbanna, China (Supplementary Table S7) [27].
Assignment of SRD5A2 haplotypes in the case and control cohorts
Considering that WES is more comprehensive than the DSD panel in calling variants, only the WES dataset of individuals with 5α-RD2 deficiency was used to construct haplotypes. The LD pattern between the SNV rs9332964 (c.680G > A) and all the flanking SNVs (located in the 2 Mb range upstream and downstream of the SRD5A2 gene) was analysed (Fig. 1A). Many loci were found to be in LD with rs9332964, and eight loci, such as rs538136903 and rs522638, had the highest correlation (D’ = 1, R^2^ = 1) (Fig. 1B). Some loci were located upstream of SRD5A2, some downstream, and some inside the gene, which were located in exons or introns (Fig. 1C). Since gene function is influenced mainly by exons, we selected rs523349 (c.265C > G) for rs9332964 (c.680G > A)-based haplotype assignment (Fig. 1C). On the basis of reference and altered allele combinations of the two loci, there were WT haplotype C-G and three alternative haplotypes: haplotype G-G was defined as Hap1, C-A as Hap2, and G-A as Hap3.Fig. 1. The SRD5A2 haplotype is especially prevalent in individuals with 5α-RD2 deficiency. A Location information of the SRD5A2 gene. B Linkage of rs9332964 (SRD5A2 c.680G > A) with other loci. C. Position information of loci associated with rs9332964 on SRD5A2. D SRD5A2 Hap3: G-A is composed of nonreference alleles of two single-nucleotide variants (SNVs), rs523349 and rs9332964, and this haplotype’s frequency profile in the case and control cohorts. All P values were calculated via two-sided Fisher’s exact tests. ** denotes p < 0.001. E Two pedigrees with SRD5A2 Hap3: G-A. Squares denote male members, circles represent female pedigree members, solid symbols represent members of individuals with 5α-RD2 deficiency, and open symbols represent unaffected members; the probands are indicated by black arrows. The nonreference alleles of rs523349 and rs9332964 are shown in blue. The question mark denotes uncertainty
For individuals with 5α-RD2 deficiency (Supplementary Table S1), five out of 39 were heterozygous at both SRD5A2 c.265C > G and c.680G > A; for these five individuals, their parental genotype information was unavailable or incomplete; thus, haplotype status could not be clearly determined (Supplementary Table S2). For individuals with non-5α-RD2 deficiency DSD (Supplementary Table S3), one out of 177 was heterozygous at both sites, and the parental genotype was unavailable for haplotyping (Supplementary Table S4). For in-house controls (Supplementary Table S5), ten out of 290 individuals were heterozygous at both sites; among the ten individuals, six obtained haplotyping results based on their family members’ genotypes, but the remaining four failed to be haplotyped (Supplementary Table S6). Among the public controls, one out of 2,504 was heterozygous at both loci and was unable to be assigned haplotypes due to missing information of the family genotypes (Supplementary Table S7).
SRD5A2 Hap3: G-A may constitute a founder effect of the population in South China
To investigate the distribution of haplotypes, their frequencies in the cases and controls were calculated. Hap3: G-A in individuals with 5α-RD2 deficiency was the highest at 64.71%, which was significantly different from that in the other case and control cohorts (Fig. 1D). Interestingly, the frequency of Hap3: G-A in non-5α-RD2 deficiency DSD cases without known DSD-related gene variants was 2.59%, which was higher than that in non-5α-RD2 deficiency DSD cases with known DSD-related gene variants, as well as other control cohorts (Fig. 1D). Specifically, seven non-5α-RD2 deficiency DSD cases without known DSD-related gene variants, harboring Hap3: G-A, were identified (Table 1). We further analysed five pedigrees with 5α-RD2 deficiency, in which the parental genotypes were available and informative, and found that Hap3: G-A was inherited rather than de novo, for example, pedigrees 1 and 2, as shown in Fig. 1E.Table 1. Seven non-5α-RD2 deficiency DSD cases without known DSD-related gene variants were carriers of hap3: G-AIDGenderAge (Year)PhenotypeHaplotypesDSD-2-14Male1.6Microphallus-Hap1--Hap3-DSD-2-86Male2Microphallus-Hap1--Hap3-DSD-2-123Male0.3Microphallus-Hap1--Hap3-DSD-2-133Male10Microphallus-Hap1--Hap3-DSD-2-169Male10.8Concealed penis and testicular calculi-Hap1--Hap3-DSD-2-176Male11Microphallus-Hap1--Hap3-DSD-2-177Male13Microphallus-Hap1--Hap3-
The distribution of Hap3: G-A in populations worldwide was further explored. In general, there are five types of populations: East Asian populations (including Han Chinese in Beijing (CHB), etc.), European populations (including Toscani in Italia (TSI), etc.), African populations (including Yoruba in Ibadan, Nigeria (YRI), etc.), and mixed-race Americans (including Mexican Ancestry from Los Angeles, USA (MXL), etc.). The population frequency of Hap3: G-A was 1.57% in our South China in-house group, 1.63% in Chinese Dai individuals in Xishuangbanna (CDX), and 1.43% in Southern Han Chinese (CHS) individuals but was absent in other ethnic groups. This finding suggests that Hap3: G-A may constitute a founder effect in southern Chinese individuals (Fig. 2). To further evaluate the population differentiation of Hap3: G-A, c.265C > G was chosen as the leading allele for the haplotype because it is common in the general population. Specifically, we calculated FST values for c.265C > G across populations (Table 2), which exhibited relatively high population differentiation between southern Chinese individuals (CDX and CHS) and populations of African, European or American ancestry but relatively low population differentiation between southern Chinese individuals and populations of East or South Asian ancestry (Table 2).Fig. 2. The distribution of SRD5A2 Hap3: G-A worldwide. The public controls from South China consisted of southern Han Chinese (CHS, n = 105) and Dai individuals in Xishuangbanna, China (CDX, n = 92). The size of the circles denotes individuals’ numbers of specific populations, and the color denotes the frequency of Hap3: G-ATable 2FST values of southern Chinese (CHS and CDX) individuals of East Asian ancestry compared with those of other populations from the 1000 genomes projectRacesClassThe FST values of c.265C > GPopulation differentiationCHBEast Asian ancestry0.00FST = 0: extremely lowKHVEast Asian ancestry0.00BEBSouth Asian ancestry0.010 < FST < 0.05: lowPJLSouth Asian ancestry0.04STUSouth Asian ancestry0.04PELAmerican ancestry0.04MXLAmerican ancestry0.050.05 < FST < 0.15: medianJPTEast Asian ancestry0.05ITUSouth Asian ancestry0.06GIHSouth Asian ancestry0.08TSIEuropean ancestry0.09PURAmerican ancestry0.11ASWAfrican ancestry0.13CEUEuropean ancestry0.15GBREuropean ancestry0.150.15 < FST < 0.25: highFINEuropean ancestry0.16LWKAfrican ancestry0.16ESNAfrican ancestry0.16CLMAmerican ancestry0.17ACBAfrican ancestry0.17YRIAfrican ancestry0.20IBSEuropean ancestry0.20GWDAfrican ancestry0.260.25 < FST: extremely highMSLAfrican ancestry0.27
Phenotypic analysis of individuals with different SRD5A2 haplotypes
Individuals with DSD presented mainly hypospadias, microphallus, and cryptorchidism (Fig. 3A–C). Since 5α-RD2 deficiency is autosomal recessive, only individuals homozygous for specific haplotypes were included to further characterize the phenotype spectrum. Among the 21 homozygotes of Hap3: G-A, 47.62% (10/21) presented isolated microphallus, followed by microphallus with hypospadias (28.57%, 6/21), microphallus with hypospadias and other DSD features (14.28%, 3/21), microphallus with other DSD features (4.76%, 1/21), and other manifestations (4.76%, 1/21) (Fig. 3D). Among the six homozygotes of Hap2: C-A, 83.33% presented isolated microphallus, and the remaining individual in this group presented microphallus with other DSD features (Supplementary Table S1). Interestingly, seven other non-5α-RD2 deficiency DSD cases without known DSD-related gene variants were carriers of Hap3: G-A, six of which presented isolated microphallus (Table 1).Fig. 3. Phenotypic analysis of individuals homozygous for SRD5A2 Hap3: G-A. A–C Schematic diagram of hypospadias, microphallus, and the cryptorchidic phenotype. D Clinical manifestations in individuals with 5α-RD2 deficiency and homozygosity for Hap3: G-A (n = 21). The pie charts show the constituent ratios of different clinical phenotypes in each subcategory. † denotes different locations of the urethral meatus [24, 45]. ‡ denotes microphallus, which was defined as a stretched penile length < 2.5 SDs below the mean for normal same-age individuals [24, 45]
Analysis of intramolecular interactions within 5α-RD2
To investigate the structural impact of variants on 5α-RD2, Chai-1 and PyMOL were used for modelling and visualization, respectively. Interaction analyses of intramolecular hydrogen bonds were conducted, which were among residues within a 5 Å distance of two mutated sites (positions 89 and 227). The impact of the variants on the hydrogen bond network was localized, which was reflected mainly in changes in the number and distribution of hydrogen bonds around key residues. For WT 5α-RD2, in total, there were 14 hydrogen bonds, including those between L89 and G85 and between R227 and F223 (Fig. 4A), suggesting relatively strong interactions within 5α-RD2. In contrast, for Hap1: G-G-derived 5α-RD2, the number of hydrogen bonds (n = 14) was the same as that in the WT, and the hydrogen bond between V89 and G85 remained (Fig. 4A, B and E). However, the hydrogen bond between M157 and F153 was lost, whereas that between F229 and L226 gained (Fig. 4E). For Hap2: C-A-derived 5α-RD2, the number of hydrogen bonds (n = 16) was slightly greater than that in the WT (n = 14) (Fig. 4A, C, and F). However, notably, the key hydrogen bond between R227 and F223 was replaced by a hydrogen bond between Q227 and F223 (Fig. 4F). In the case of Hap3: G-A, the number of hydrogen bonds decreased (WT: 14 vs. Hap3: 13), and the hydrogen bond network was redistributed. The hydrogen bond between V89 and G85 was maintained at 2.9 Å, whereas the hydrogen bond distance between Q227 and F223 was 3.0 Å (Fig. 4A, D, and G). Hap3: G-A resulted in relatively large changes in the hydrogen bond network, ultimately affecting the overall stability of 5α-RD2.Fig. 4. Stability analysis of 5α-RD2 based on hydrogen bonds. A–D Visualization of hydrogen bonds within a 5 Å radius around amino acid positions 89 and 227 of the wild-type (WT): C-G (A), Hap1: G-G (B), Hap2: C-A (C), and Hap3: G-A (D) derived 5α-RD2, respectively. Residues 89 and 227 are highlighted with a red box. Hydrogen bonds are represented as blue dashed lines. E–G Comparison of the number and distribution of hydrogen bonds around amino acid positions 89 and 227 within 5α-RD2 between WT and Hap1 (E), Hap2 (F), and Hap3 (G)
SASA analysis of the 3D structure composed of 5α-RD2 with NADPH and testosterone
Four ternary complex models were constructed, each consisting of 5α-RD2 (derived from WT, Hap1, Hap2, or Hap3), NADPH, and testosterone. The SASA was calculated for each of the ternary complex models. Compared with that of the WT, the SASA of the Hap1: G-G-based model increased by 33.03 Å^2^ (Table 3). This suggested that Hap1: G-G likely shortened the hydrophobic side chain of 5α-RD2, thereby fine-tuning the 5α-RD2’ conformation. In comparison, Hap2: C-A and Hap3: G-A only mildly increased the SASA by 2.45 Å^2^ and 10.72 Å^2^, respectively (Table 3). Since testosterone is a hydrophobic molecule, the larger the SASA is, the less likely testosterone is to be recruited near the enzyme; thus, Hap1: G-G may cause a greater decrease in 5α-RD2 enzymatic binding with testosterone than Hap2: C-A or Hap3: G-A.Table 3SASA of the ternary models composed of 5α-RD2, NADPH and testosteroneGroupSASA (Å^2^)ChangeWT: C-G (L89, R227)25,700.67–Hap1: G-G (L89V)25,733.70 + 33.03Hap2: C-A (R227Q)25,703.12 + 2.45Hap3: G-A (L89V, R227Q)25,711.39 + 10.72
Interaction analysis of 5α-RD2 with NADPH and testosterone
On the basis of the previously published covalent crystal structure of 5α-RD2, the active region of the enzyme for ligand binding was estimated. The NADPH interaction region might be within a 5 Å radius around position 227 of 5α-RD2, whereas testosterone might be located within a 5 Å radius around position 114 of 5α-RD2. Accordingly, ternary structure models were built.
In the WT 5α-RD2 and NADPH model, several positively charged residues (K35, R94, R105, R171, R179, R227, etc.) formed salt bridges/hydrogen bonds with the phosphate group of NADPH (Fig. 5A), which were essential for maintaining high affinity between 5α-RD2 and NADPH. In addition, peripheral polar residues (N160, N193, Y235, etc.) provided supplementary hydrogen bonds, and hydrophobic residues (F223, L224, etc.) helped anchor the adenine ring (Fig. 5A). The hydrogen bonding interaction between the positively charged R227 and the phosphate group of NADPH was the most critical factor determining the strength of enzyme-cofactor binding. Compared with WT, Hap1: G-G retained R227 hydrogen bond interactions with NADPH, thus showing a relatively limited influence on 5α-RD2 binding with NADPH (Fig. 5B and E). However, Hap2: C-A significantly disrupted the original network of positive charge and hydrogen bond interactions (such as R227 and NADPH) associated with the phosphate group of NADPH, leading to the loss of key salt bridges and hydrogen bonds with NADPH, thus directly significantly weakening the binding stability of 5α-RD2 and NADPH (Fig. 5C and F). For Hap3: G-A, owing to arginine substituted by glutamine at position 227, the essential hydrogen bond between R227 and NADPH was lost; however, owing to the effect of leucine replacement by valine at position 89, the remaining hydrogen bond network was relatively complex (Fig. 5D and G).Fig. 5. Interaction analysis of 5α-RD2 with NADPH. A–D Visualization of hydrogen bonds between NADPH and 5α-RD2 derived from WT: C-G (A), Hap1: G-G (B), Hap2: C-A (C), and Hap3: G-A (D), respectively. The NADPH is shown as a gray stick. The key residue R227 that binds NADPH is highlighted with a red box. E–G Comparison of hydrogen bond number and distribution of 5α-RD2 and NADPH for WT: C-G and Hap1: G-G (E), Hap2: C-A (F), and Hap3: G-A (G)
Since testosterone primarily interacts with 5α-RD2 through hydrophobic residues (such as F219, F223, and some conserved hydrophobic sites) and a few hydrogen-bonding residues in both the WT and Hap1/2/3 models, positions 89 and 227 were closer to the NADPH-binding region than to the testosterone-binding region, their direct impact on the testosterone-binding site was relatively minor.
SRD5A2 Hap3: G-A affected 5α-RD2 activity kinetics with a hypomorphic effect
To validate the functional defects caused by the SRD5A2 haplotypes, in vitro protein overexpression experiments were conducted. The empty vector plasmid expressing only GFP served as the mock control, while the WT and truncated mutant (c.679C > T, p.R227*) SRD5A2 plasmids were used as positive and negative controls, respectively. WB results confirmed the validity of the transfection and overexpression of different haplotypes (Fig. 6A). As expected, LC–MS/MS analysis showed that, similar to the mock control, the truncated variant largely decreased DHT production when cultured with various concentrations of NADPH or testosterone (Fig. 6B–C). When testosterone was 10 μM, NADPH with different concentrations, compared to WT (Vmax = 134.28 ± 28.18 pmol/mg protein/min), the Vmax values of Hap1: G-G, Hap2: C-A and Hap3: G-A were 118.71 ± 26.08, 41.80 ± 6.49, and 92.70 ± 19.15 pmol/mg protein/min, respectively. The corresponding catalytic efficiencies decreased to 88.40%, 31.13%, and 69.03%, respectively (Fig. 6B). Similarly, when NADPH was at 500 μM, testosterone with different concentrations, compared to WT (Vmax = 164.23 ± 26.80 pmol/mg protein/min), the Vmax values of Hap1: G-G, Hap2: C-A and Hap3: G-A were 140.85 ± 25.90, 65.94 ± 26.22, and 100.31 ± 35.33 pmol/mg protein/min, respectively. The corresponding catalytic efficiencies decreased to 85.76%, 40.15%, and 61.08%, respectively (Fig. 6C). Collectively, the catalytic efficiencies of the enzymes encoded by Hap1: G-G, Hap2: C-A, and Hap3: G-A all ranked between those of the WT and the truncated variant. Notably, the enzyme encoded by Hap3: G-A exhibited an intermediate catalytic efficiency between that of Hap1: G-G and Hap2: C-A.Fig. 6SRD5A2 Hap3: G-A inhibited the catalytic efficiency of the enzymes. A Protein expression levels of SRD5A2 from all overexpression plasmids. The WB experiments were performed in triplicate. B Catalytic activity of 5α-RD2, measured by dihydrotestosterone (DHT) production, across a range of NADPH concentrations at a fixed testosterone concentration (10 μM). C Catalytic activity of 5α-RD2, measured by DHT production, across a range of testosterone concentrations at a fixed NADPH concentration (500 μM). The mock vector was an empty control vector expressing only GFP. The LC-MS/MS experiments were replicated three times
Discussion
Individuals with DSD, for example, 5α-RD2 deficiency, exhibit high genetic and phenotypic heterogeneity, and the genotype–phenotype correlation remains unclear [34]. In this study, we identified a novel SRD5A2 haplotype (Hap3: G-A) that may have a potential founder effect by characterizing population distribution and differentiation. 3D structure prediction and model construction suggested that Hap3: G-A may have a combined effect on the structure and interaction of 5α-RD2 with ligands, as supported by functional assays directly demonstrating its hypomorphic effect on enzyme activity.
The founder effect is the establishment of a new population from a small group of founding individuals with similar genetic characteristics, partially due to internal preferences resulting in geographical and cultural isolation [35]. To investigate the founder effect in individuals with 5α-RD2 or non-5α-RD2 deficiency DSD, the majority of studies have focused on examining a single variant [16, 36], even though few studies have performed haplotype analysis [10, 25]. For example, Zhu et al. performed haplotyping focusing on p.G203S but did not find a founder effect based on the frequency profile of the specific haplotype [10], probably due to the limited number of enrolled cases. Haplotyping is important for identifying disease-associated variants [37, 38], mapping recombination patterns [39], tracing ancestry [40–42], and offering valuable insights into the pathogenesis of diseases [26, 43, 44]. Our previous multicenter study reported that the unique prevalence of c.680G > A in the Chinese population suggested a putative founder effect of SRD5A2 [24, 45]. On the basis of large-scale participants, including 216 cases and 2,794 controls, this study identified a SRD5A2 haplotype in linkage (Hap3: G-A, represented by c.265C > G and c.680G > A), which was frequently found in individuals with 5α-RD2 deficiency but had a low frequency in in-house and public controls across multiple ethnogeographic groups. Globally, Hap3: G-A showed different distributions between southern Chinese populations and other populations, probably resulting from a potential founder effect, which might be due to geographical isolation from a large coefficient of inbreeding [14, 46, 47]. This finding was further supported by the high population differentiation between southern Chinese individuals and populations of African, European or American ancestry, in contrast to the low differentiation between southern Chinese individuals and populations of East or South Asian ancestry. The distinctive distribution of Hap3: G-A between North China and South China seems to indicate further splitting of the Chinese population into subgroups with divergent genetic backgrounds and multiethnic origins. These findings underscore the importance of considering regional genetic variations when studying the epidemiology and clinical presentation of DSD. The existence of the prevalent haplotype might indicate a shared ancestry among cases. To the best of our knowledge, this is the first study investigating SRD5A2 with a founder effect from the perspective of haplotyping, thus providing new insights into the diverse genetic background of 5α-RD2 deficiency.
Several molecular pathways, including steroid hormone biosynthesis pathway and others, are involved in the sex development process. These signalling pathways form a complex regulatory network, affecting the presence and variability of phenotypes in DSDs. Our previous analysis of the phenotype-genotype correlation of SRD5A2 in individuals with 5α-RD2 deficiency found that c.680G > A predominated in cases with normal meatus or distal hypospadias compared with those with proximal hypospadias [24]. In addition, c.265C > G was traditionally considered a benign variant because of its high frequency in the normal population; thus, its correlation with phenotype has rarely been investigated [48]. Interestingly, we found that Hap3: G-A of SRD5A2 was prevalent in individuals with 5α-RD2 deficiency and exhibited a trend of higher frequency in non-5α-RD2 deficiency DSD cases without known DSD-related gene variants than in those with known DSD-related gene variants, suggesting that Hap3: G-A may play an important role in the regulatory network underlying DSD. Variants in common DSD-related genes, such as AR, have been detected in combination with additional variants in other genes in DSD individuals with a broad range of phenotypes, implying a role of oligogenic inheritance in DSD [49]. The effects of some variants of DSD-related genes might be minor or moderate in affecting disease pathogenesis; however, these variants can accumulate with Hap3: G-A of SRD5A2 or other gene variants to perturb the regulatory network of sex development, thus contributing to disease risk. An analysis of the phenotype spectrum demonstrated that the majority of homozygotes of Hap3: G-A presented microphallus, and nearly half of them manifested isolated microphallus. It was expected that homozygotes of Hap3: G-A might exhibit relatively mild phenotypes. These findings provide a novel perspective for elucidating the genetic complexity and phenotypic variability observed in DSD individuals.
Structural prediction analysis provides theoretical insights for understanding the function of SRD5A2 haplotypes. Intramolecular hydrogen bonds primarily stabilize tertiary structures [50], whereas intermolecular hydrogen bonds are essential for molecular recognition, signal transduction, and enzymatic catalytic activity [51, 52]. Hap1: G-G mainly increased the SASA of the complex but mildly influenced the stability of the enzyme and its interactions with the substrate testosterone and cofactor NADPH, which is consistent with its mild effect on the catalytic activity observed in kinetic assays. In contrast, Hap2: C-A was predicted to disrupt 5α-RD2 binding with NADPH by reducing the number of hydrogen bonds, which is consistent with its largely decreased catalytic activity. Hap3: G-A lost significant interaction with NADPH because of R227Q, whereas L89V increased 5α-RD2 stability locally by shortening the side chain, which compensated for the disruption effect of R227Q. Overall, Hap3: G-A seemed to have a combined effect on the structure and interaction of 5α-RD2, rather than simply additive effects of its constituent variants. Further kinetic assays confirmed that the catalytic efficiencies of the enzymes encoded by Hap1: G-G, Hap2: C-A, and Hap3: G-A all ranked between those of the WT and the truncated variant, suggesting a hypomorphic effect of these haplotypes. Notably, the enzyme encoded by Hap3: G-A exhibited an intermediate catalytic efficiency between that of Hap1: G-G and Hap2: C-A, probably attributable to the differential effects of these haplotypes on the binding of 5α-RD2 with testosterone or NADPH.
To further elucidate the associations between specific haplotypes and clinical phenotypes, expanding the cohort by recruiting more individuals with 5α-RD2 deficiency and other DSDs would be beneficial. Future research should focus on validating structural changes via experiments or molecular dynamics simulations to gain a deeper understanding of kinetic properties and functional mechanisms. In addition to in vitro experiments, in vivo validation in animal models to elucidate the role of specific haplotypes, as well as their interactions with DSD-related regulatory networks, might be considered.
Perspectives and significance
This study identified a novel hypomorphic SRD5A2 haplotype with a potential founder effect, which was composed of common variants. These findings provide valuable insights for understanding complex genotype–phenotype correlations through haplotyping of specific genes, facilitating genetic counselling, early intervention and clinical management of individuals with 5α-RD2 deficiency or even other DSDs. Notably, in future clinical practice, analyses of genetic factors underlying 5α-RD2 deficiency and other DSDs should shift from focusing solely on individual variants to accounting for the interactions and combined effects of multiple variants, particularly those in linkage, potentially offering new clues for therapeutic targets for affected individuals.
Conclusions
This study identified a novel SRD5A2 haplotype with a potential founder effect in individuals with 5α-RD2 deficiency from South China, which was composed of altered alleles from relatively common variants, c.265C > G and c.680G > A. Structural prediction indicated molecular changes underlying the potential role of this haplotype in gene dysfunction, which was supported by functional assays directly demonstrating its hypomorphic effect on enzyme activity. These findings provide valuable insights for understanding genotype–phenotype correlations, genetic counselling, early intervention and clinical management of individuals with 5α-RD2 deficiency or even other DSDs.
Supplementary Information
Additional file 1. Additional file 2. Additional file 3. Additional file 4. Additional file 5. Additional file 6. Additional file 7. Additional file 8.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Chai Discovery JB, Dent J, Mc Partlon M, Meier J, Reis V, Rogozhnikov A, Wu K. Chai-1: Decoding the molecular interactions of life. 2024. Preprint at https://www.biorxiv.org/content/10.1101/2024.10.10.615955 v 2.full.pdf.
