Common variation in meiosis genes shapes human recombination and aneuploidy
Sara A. Carioscia, Arjun Biddanda, Margaret R. Starostik, Xiaona Tang, Eva R. Hoffmann, Zachary P. Demko, Rajiv C. McCoy

TL;DR
This study identifies genetic variants linked to meiotic errors that cause aneuploidy, a major cause of pregnancy loss.
Contribution
The study reveals a common haplotype in SMC1B and other genes that influence both recombination and aneuploidy risk.
Findings
A common haplotype in SMC1B is associated with crossover count and maternal meiotic aneuploidy.
Variants in C14orf39, CCNB1IP1, and RNF212 are linked to meiotic aneuploidy risk.
Aneuploidy-associated variants also show connections to recombination and reproductive aging traits.
Abstract
The leading cause of human pregnancy loss is aneuploidy, often tracing to errors in chromosome segregation during female meiosis1,2. Although abnormal crossover recombination is known to confer risk for aneuploidy3,4, limited data have hindered understanding of the potential shared genetic basis of these key molecular phenotypes. To address this gap, we performed retrospective analysis of pre-implantation genetic testing data from 139,416 in vitro fertilized embryos from 22,850 sets of biological parents. By tracing transmission of haplotypes, we identified 3,809,412 crossovers, as well as 92,485 aneuploid chromosomes. Counts of crossovers were lower in aneuploid versus euploid embryos, consistent with their role in chromosome pairing and segregation. Our analyses further revealed that a common haplotype spanning the meiotic cohesin SMC1B is associated significantly with both crossover…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 10
Figure 11
Figure 12
Figure 13
Figure 14
Figure 15
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsPrenatal Screening and Diagnostics · Microtubule and mitosis dynamics · Genomic variations and chromosomal abnormalities
Main
Despite their critical role in encoding genetic information, chromosomes frequently mis-segregate during human meiosis, producing abnormalities in chromosome number—a phenomenon termed aneuploidy. Aneuploidy is the leading cause of human pregnancy loss, as well as the cause of genetic conditions such as Klinefelter, Turner and Down syndromes^1,2^. It is estimated that only approximately half of human conceptions survive to birth, primarily because of the abundance of aneuploidies that are inviable in early gestation^5,6^.
Work in humans and model organisms has established that one risk factor for aneuploidy involves variation in the number and location of meiotic crossover recombination events, especially in the female germline^3,4^. Notably, female meiosis initiates in fetal development, when replicated homologous chromosomes (homologues) pair and establish crossovers, which, together with cohesion between sister chromatids, hold homologues together in a ‘bivalent’ configuration. Homologues segregate (meiosis I) upon ovulation after the onset of puberty, whereas sister chromatids segregate (meiosis II) after fertilization. The physical linkages formed by meiotic crossovers help stabilize paired chromosomes during this prolonged period of female meiotic arrest^7^. Cohesin complexes, loaded in developing fetal oocytes, link sister chromatids and are crucial for chromosome synapsis and crossover formation^8,9^. Failure to form bivalents due to lack of crossovers^10^ or their suboptimal placement^11^, as well as age-related cohesin deterioration^12^, can lead to premature separation of sister chromatids and the related phenomenon of reverse segregation, which together represent the predominant mechanisms of maternal meiotic aneuploidy^13^.
Although producing sex-specific recombination maps and revealing associations with crossover phenotypes at meiosis-related genes, the largest studies of crossovers in living human families lacked aneuploid participants and only speculated about such relationships^14,15^. Much of current knowledge about the connection between human recombination and aneuploidy, as well as their genetic bases, thus comes from smaller samples of people living with survivable aneuploidies, limiting statistical power. By contrast, recent advances in single-cell sequencing have enabled simultaneous discovery of crossovers and aneuploidies in sperm and eggs, but are typically relegated to small numbers of gametes (in the case of oocytes) or small numbers of donors, hindering understanding of variability and potential shared genetic architecture of these phenotypes^16–18^.
Clinical genetic data from pre-implantation genetic testing (PGT) of in vitro fertilized (IVF) embryos help overcome these limitations and offer an ideal resource for characterizing aneuploidy and mapping meiotic crossovers at scale. Here we used single nucleotide polymorphism (SNP) array-based PGT data from 139,416 blastocyst-stage embryo biopsies and 22,850 sets of biological parents to (1) map recombination and aneuploidy, (2) test their relationship quantitatively and (3) discover genetic factors that modulate their incidence and features. Our analysis revealed an overlapping genetic basis of female recombination and aneuploidy formation involving common variation in key meiotic machinery. Together, our work offers a more complete view of the sources of variation in the fundamental molecular processes that generate genetic diversity while impacting human fertility.
Meiotic aneuploidy is common in embryos
Seeking insight into meiotic crossover recombination and the origins of aneuploidies, we performed retrospective analysis of data from PGT. Specifically, these data comprised SNP microarray genotyping of bulk (approximately six cells) trophectoderm biopsies from 156,828 blastocyst-stage embryos (5 days post-fertilization), as well as DNA isolated from buccal swabs or blood from both biological parents (24,788 patient–partner pairs) (Fig. 1a and Supplementary Figs. 1 and 2; Supplementary Methods). We developed a hidden Markov model (HMM), called karyoHMM, to trace the transmission of parental haplotypes to sampled embryos and thereby identify aneuploidies and crossover recombination events. Specifically, we modelled transitions between the haplotypes transmitted from the same parent as crossovers and inferred the chromosome copy number that best explained the embryo data (Fig. 1b and Supplementary Fig. 3; Supplementary Methods).Fig. 1. Data from PGT of IVF embryos offer insight into crossover recombination and aneuploidy.Colours indicate maternal (purple) versus paternal (blue) data features. a, Data comprise SNP microarray genotyping of trophectoderm biopsies from sibling embryos, as well as DNA from parents. b, Tracing transmission of parental haplotypes from parents to embryos reveals evidence of crossovers, as well as aneuploidies. c, Aneuploidies primarily involve gain or loss of maternal homologues and are enriched on particular chromosomes. Complex aneuploidies (more than five affected chromosomes) and genome-wide ploidy abnormalities (for example, triploidy) are excluded (Extended Data Fig. 1). d, Aneuploidies affecting maternal homologues increase with maternal age, whereas aneuploidies affecting paternal homologues exhibit no significant relationship with paternal age. e, Maternal crossovers exceed paternal crossovers. Embryos with crossover counts outside of 3 s.d. from the sex-specific mean are excluded. f, Crossover counts differ between disomic chromosomes of euploid (n = 46,856) and aneuploid (n = 34,542) embryos containing at least a single maternal crossover (two-sided Poisson GLMM), but the proportion of crossovers occurring within hotspots does not (two-sided Gamma GLMM). Error bars indicate 95% confidence intervals. Illustration in a adapted from NIH BioArt Source (https://bioart.niaid.nih.gov/bioart/209) under a Public Domain licence CC0 1.0.
Applying this method to a dataset where low-quality samples were removed (139,416 remaining embryos; Supplementary Methods), we identified 41,480 (29.8%) embryos with at least one aneuploid chromosome (92,485 aneuploid chromosomes; Extended Data Fig. 1). Trisomies exceeded monosomies (57,974 trisomies, 34,511 monosomies; ratio, 0.626; 95% confidence intervals, 0.624, 0.630; two-sided binomial test, P < 1 × 10^−100^), indicative of selection before blastocyst formation^6^. However, trisomies and monosomies of all individual autosomes and sex chromosomes were detected within the sample (Fig. 1c). Aneuploidies largely involved gain or loss of maternal versus paternal homologues (84,044 maternal:8,441 paternal; ratio, 0.909; 95% confidence intervals, 0.907, 0.911; two-sided binomial test, P < 1 × 10^−100^) and were strongly enriched for chromosomes 15, 16, 21 and 22, replicating previous literature^19^.
We also replicated the association between maternal age and the incidence of aneuploidies affecting maternal homologues (binomial generalized linear mixed model (GLMM), \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.235, s.e. = 2.19 × 10^−3^, P < 1 × 10^−100^; Supplementary Table 1)^13^. The data were well fit by a model with a quadratic term for maternal age (Fig. 1d, Supplementary Fig. 4 and Supplementary Table 1; Supplementary Methods). Positive associations with maternal age were also significant when stratifying the phenotype to maternal meiotic aneuploidy of individual chromosomes (Supplementary Table 1). Further supporting selection against meiotic aneuploidies, per patient rates of maternal meiotic aneuploidy were inversely associated with per-cycle embryo counts, even when controlling for maternal age (binomial GLMM, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = −0.030, s.e. = 6.88 × 10^−3^, P = 1.29 × 10^−5^). Despite the statistical power afforded by the large sample size, we observed no significant association between paternal age and aneuploidies affecting paternal homologues (binomial GLMM, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = −1.06 × 10^−3^, s.e. = 0.013, P = 0.936; Fig. 1d and Supplementary Table 1), consistent with previous findings^19^. The absence of paternal age association also held for the sex chromosomes, where paternal meiotic aneuploidies were relatively more common (binomial GLMM, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 2.14 × 10^−3^, s.e. = 0.020, P = 0.914; Supplementary Table 1).
Aneuploid embryos possess fewer crossovers
Previous studies have shown that abnormal number or placement of crossovers confers risk for meiotic aneuploidy^1,4^. These include studies of survivable trisomies^20,21^, gametes^2,16,17^ and embryos^16,22^, which broadly demonstrated that aneuploid chromosomes are depleted of crossovers compared with corresponding disomic chromosomes.
Across 46,861 euploid embryos (and requiring at least three sibling embryos; Supplementary Methods), we identified 2,310,257 maternal- and 1,499,155 paternal-origin autosomal crossovers (3,809,412 total) at a median resolution of 99.43 kilobase pairs (kbp) (Fig. 1e). The mean counts of sex-specific crossovers per meiosis (49.30 maternal, 31.99 paternal), as well as their genomic locations (Spearman correlation (r) at 100-kbp resolution: 0.96 maternal, 0.98 paternal), were consistent with previous pedigree-based studies of living human cohorts^14,15^. We also observed substantial proportions of chromosomes that lack detected crossovers from a given parent (maternal = 1.67–35.56%, paternal = 7.83–51.77%), particularly among short chromosomes such as chromosomes 21 and 22 where aneuploidies are common (Extended Data Fig. 2). Acknowledging the limited resolution of the genotyping array at chromosome ends, these estimates conform with observations from living human pedigrees^14^.
Previous literature offers conflicting evidence about the relationship between counts of meiotic crossovers and maternal age, with some studies reporting a positive association^14,15,23^ and others reporting a negative association^24^. As those studies focused largely on living families, positive associations were interpreted typically as evidence of selection against aneuploid embryos, which possess fewer crossovers on average and increase in frequency with maternal age. Within our sample, we observed no significant association between maternal age and number of maternal crossovers (Poisson GLMM, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = −2.62 × 10^−5^, s.e. = 1.68 × 10^−3^, P = 0.988). This observation held even when restricting analysis to euploid embryos (Poisson GLMM, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 5.12 × 10^−4^, s.e. = 1.43 × 10^−3^, P = 0.721), offering a point of evidence against the hypothesis that embryonic aneuploidy explains previously reported age associations with crossovers.
We used these crossover data to perform genome-wide association studies (GWAS) across four phenotypes: mean count of autosomal crossovers across euploid embryos (crossover count); fraction of crossovers within recombination hotspots based on published genetic maps (hotspot occupancy); mean timing of DNA replication at crossover sites (replication timing); and mean guanine–cytosine content ±500 bp around crossover sites (GC content; Supplementary Methods). We identified 15 unique association signals achieving genome-wide significance (P < 5 × 10^−8^), all of which replicated previous findings^14,25^ (Supplementary Table 2 and Extended Data Figs. 3–6), including a haplotype spanning RNF212 with opposing directions of association with maternal versus paternal recombination rates (lead SNP rs3816474; maternal \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = −0.089 ± 0.013 s.e., P = 1.84 × 10^−11^; paternal \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.186 ± 0.013 s.e., P = 1.76 × 10^−47^; Extended Data Fig. 3). Complementing these GWAS, we performed transcriptome-wide association studies (TWAS) to associate predicted gene expression across several tissues^26^ with recombination phenotypes, identifying 35 unique genes significantly associated with at least one recombination phenotype (P < 3.0 × 10^−6^; Supplementary Table 3; Supplementary Methods). Prominent hits included the synaptonemal complex component C14orf39 (also known as SIX6OS1)^27^ and crossover-regulating ubiquitin ligase CCNB1IP1 (also known as HEI10)^28^, implying that previously reported genetic associations at these loci could be driven by non-coding regulatory mechanisms^14^.
To examine the relationship between crossovers and aneuploidies, we contrasted patterns of crossovers between aneuploid and euploid embryos. One technical limitation for direct detection of crossovers using genetic data from trisomic chromosomes is that crossovers can be missed when both reciprocal products of a single crossover event are transmitted to the embryo^16^. To overcome this concern, we instead contrasted counts of crossovers on disomic chromosomes of aneuploid embryos (with aneuploidy affecting a different chromosome) to corresponding disomic chromosomes of euploid embryos. This comparison relies on the previous observation that crossover counts positively covary across chromosomes within meiocytes^29^—a phenomenon that we replicated for euploid embryos within our dataset (intraclass correlation coefficient = 0.176; 95% confidence intervals, 0.11, 0.3; P < 1 × 10^−100^ maternal; intraclass correlation coefficient = 0.088; 95% confidence intervals, 0.05, 0.16; P < 1 × 10^−100^ paternal; Extended Data Fig. 7; Supplementary Methods). As input to our test, we identified 1,505,107 maternal- and 1,007,176 paternal-origin crossovers on disomic chromosomes across 34,542 embryos with at least one aneuploid chromosome (and requiring at least three sibling embryos). Using a Poisson GLMM (Supplementary Methods), we found that the number of crossovers was significantly lower on the disomic chromosomes of aneuploid embryos relative to euploid embryos ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.105 difference in marginal means ± 6.923 × 10^−5^ s.e.; P = 4.64 × 10^−150^; Fig. 1f). These results are consistent with the understanding that reduction in crossovers—and absence of crossovers, in particular^10^—confers risk for meiotic aneuploidy.
SMC1B variants associate with aneuploidy
Previous studies have suggested that the incidence of female meiotic aneuploidy may be individual-specific, even after accounting for maternal age^30^. To test this hypothesis, we fit a quasi-binomial generalized linear model (GLM) to the per patient counts of embryos affected versus unaffected with maternal meiotic-origin aneuploidy, including maternal age as a quadratic covariate (Supplementary Methods). Compared with a simulated binomial null distribution, the observed incidence of meiotic aneuploidy was significantly overdispersed across female patients, controlling for maternal age (dispersion parameter (φ) = 1.15, P < 0.01; Supplementary Fig. 5). Overdispersion was also apparent when stratifying analysis to maternal meiotic aneuploidies affecting individual chromosomes (Supplementary Table 4). These observations of overdispersion suggest a role of genetic and environmental factors beyond age in observed variation in maternal meiotic aneuploidy.
To investigate the genetic component, we scanned for variation in maternal genomes associated with the incidence of maternal meiotic aneuploidy. We implemented these association tests using a binomial GLMM, controlling for covariates including maternal age (Supplementary Methods). We first tested for cis-genetic effects on aneuploidy risk by associating incidence of aneuploidy affecting each individual chromosome with maternal genotypes restricted to that chromosome, but we identified no associations achieving genome-wide significance (P < 5 × 10^−8^). Proceeding to a genome-wide analysis considering maternal meiotic aneuploidies affecting any chromosome, we discovered two genome-wide significant associations (Fig. 2a and Supplementary Fig. 6). The first hit (lead SNP rs9351349, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.078, s.e. = 0.014, P = 2.93 × 10^−8^) lies within an intergenic region of chromosome (Chr.) 6 but did not replicate in a held-out test set comprising 15% of female patients ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.021, s.e. = 0.033, P = 0.529). The second hit (lead SNP rs6006737, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.066, s.e. = 0.012, P = 2.21 × 10^−8^) lies on Chr. 22 and replicated in the held-out test set ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.059, s.e. = 0.028, P = 0.033). The minor (C) allele of rs6006737 within our sample is globally common, segregating at high frequencies (gnomAD allele frequency (AF) = 0.78) in African populations but at lower frequencies in European (gnomAD AF = 0.35) and other non-African populations^31^. The effect is additive, whereby for a 40-year-old patient, each copy of the risk allele confers an estimated 1.65% additional average risk of aneuploidy (Fig. 2b). We also detected evidence of a small but statistically significant interaction between maternal age and genotype (likelihood ratio test, χ^2^(1) = 4.24, P = 0.040), indicating that the effect of genotype increases with increasing maternal age ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.026, s.e. = 0.013, P = 0.045). Notably, the size and direction of the main effect of genotype is relatively consistent for aneuploidies of all individual autosomes (Extended Data Fig. 8), suggesting general, genome-wide impacts on meiotic fidelity.Fig. 2. Variants defining a haplotype spanning SMC1B are associated with incidence of maternal meiotic aneuploidy.a, Results of GWAS tests of maternal meiotic aneuploidy and maternal genotype (two-sided binomial GLMM). The dotted line indicates the threshold for genome-wide significance (P = 5 × 10^−8^). b, Fitted relationship between maternal age and incidence of aneuploidy, stratified by maternal genotype at aneuploidy-associated lead SNP rs6006737. c. Regional Manhattan plot depicting the associated locus on Chr. 22, with points coloured based on pairwise linkage disequilibrium with the lead SNP rs6006737 (diamond). Mbp, megabase pairs.
The associated haplotype spans approximately 120 kbp, encompassing four genes: UPK3A, FAM118A, RIBC2 and SMC1B (Fig. 2c). SMC1B encodes a component of the ring-shaped cohesin complex (Fig. 3a), with integral roles in sister chromatid cohesion and homologous recombination during meiosis^32,33^. Smc1b-deficient mice of both sexes are sterile, and female mice exhibit meiotic abnormalities including reduction in crossovers, incomplete chromosome synapsis, age-related premature loss of sister chromatid cohesion and chromosome mis-segregation^32,33^. Previous work in humans demonstrated associations between a less common (gnomAD global AF = 0.06) SMC1B missense variant (rs61735519; r^2^ with GWAS lead SNP rs6006737 = 0.089, D′ = 0.943) and recombination phenotypes^14^. Although imputed with moderate accuracy (dosage r^2^ = 0.80), this missense variant exhibits only modest association with aneuploidy within our sample ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.112, s.e. = 0.040, P = 4.80 × 10^−3^). Meanwhile, the more common aneuploidy-associated haplotype tagged by GWAS lead variant rs6006737 lacks amino acid altering variation (r^2^ < 0.1 for all SMC1B nonsynonymous variants), motivating us to explore potential regulatory mechanisms driving the observed phenotype.Fig. 3. The aneuploidy risk haplotype is associated with lower expression of SMC1B, driven by two independent causal signals.a, Schematic of the meiotic cohesin complex. b, Each copy of the aneuploidy risk allele is associated with reduced expression of SMC1B in human lymphoblastoid cell lines (LCLs; n = 731) from diverse populations. Bars represent the first and third quartiles of the data, white points represent the second quartile (median) and whiskers are bound to 1.5× the interquartile range. c, Pairwise linkage disequilibrium between a set of SNPs including GWAS lead SNP rs6006737 and variants defining fine-mapped eQTL credible sets for SMC1B. d, Fine-mapped eQTL rs2272804 (credible set 1) lies within a putative promoter sequence within open chromatin, while variants defining a second credible set are distributed throughout the upstream region of SMC1B.
Associated haplotype is an SMC1B expression quantitative trait locus
Querying the GWAS lead variant (rs6006737) in data from the Genotype Tissue Expression (GTEx) Project^26^, we observed that the aneuploidy risk allele is associated significantly with reduced expression of SMC1B across diverse tissues. Although invaluable, GTEx largely includes participants of European ancestries, limiting resolution for fine-mapping of causal expression-altering variants. To address this limitation, we also queried the GWAS lead variant in MAGE, which includes RNA sequencing data from lymphoblastoid cell lines from 731 people from 26 globally diverse populations^34^. Consistent with GTEx, rs6006737 is a strong expression quantitative trait locus (eQTL) of SMC1B in MAGE ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = −0.429, s.e. = 0.048, P = 4.68 × 10^−18^; Fig. 3b). Fine-mapping within MAGE decomposes the eQTL signals for SMC1B into two credible sets containing candidate causal variants (coverage = 0.95) (Fig. 3c,d). Whereas one credible set includes nine variants distributed throughout the upstream region of SMC1B, the other is defined by a single SNP (rs2272804; posterior inclusion probability > 0.99), 144 bp upstream of the SMC1B transcription start site.
The regulatory potential and accessibility of the putative promoter CpG island sequence within which rs2272804 resides is supported by published epigenomic and ATAC-seq (assay for transposase-accessible chromatin using sequencing) data from human ovaries^35,36^ (Fig. 3d). We further noted that the SNP lies within a predicted binding motif of ATF1—a transcription factor expressed in female germ cells^37^ and inferred previously to regulate paralogue SMC1A based on chromatin immunoprecipitation sequencing data^38^. Binding of ATF1 to the SNP-encompassing locus is also supported by high-confidence chromatin immunoprecipitation sequencing peaks in induced pluripotent stem cells (WTC11) assayed by the ENCODE Project^38^. By performing an electrophoretic mobility shift assay, we found that a DNA construct containing the alternative allele of rs2272804 had more than threefold lower binding affinity (dissociation constant, KD) for purified human ATF1 in vitro than a construct containing the reference allele (Student’s t-test, mean reference KD = 56.62 nM ± 4.65 s.d., mean variant KD = 173.39 nM ± 15.24 s.d., P = 2.60 × 10^−4^), consistent with the observed eQTL effect (Extended Data Fig. 9). Taken together, these results suggest a potential non-coding regulatory mechanism underlying the observed genetic association with maternal meiotic aneuploidy.
TWAS reveals new links to meiosis genes
Motivated by our observations at SMC1B, we next sought to examine whether other cis-regulatory effects on expression could influence aneuploidy risk. We therefore used TWAS to test whether predicted gene expression across tissues is associated with incidence of aneuploidy (Supplementary Methods). Across 16,685 protein-coding genes, we identified two hits achieving transcriptome-wide significance (P < 3 × 10^−6^; Extended Data Fig. 10). Although led by adjacent gene RIBC2 (P = 2.19 × 10^−7^), the peak on Chr. 22 includes SMC1B (P = 7.63 × 10^−6^), replicating our findings from GWAS and downstream functional dissection. We hypothesize that RIBC2 represents a secondary, noncausal association, whereby the same haplotype (and potentially the same causal variant^39^) co-regulates expression of both genes, driving their correlation (Supplementary Fig. 7). The second peak lies on Chr. 14 and is led by C14orf39 (P = 1.65 × 10^−7^), which encodes a component of the synaptonemal complex, which mediates synapsis, recombination and segregation of homologous chromosomes during meiosis^27^. Previous studies have linked rare C14orf39 variants to human infertility^40,41^ and demonstrated associations between common C14orf39 variants and recombination phenotypes^14,25^. Our results connect these findings and show that both rare and common variation influencing female fertility phenotypes can converge on the same meiosis-related genes. Although not achieving transcriptome-wide significance, a third peak, on Chr. 12, includes NCAPD2 (P = 2.16 × 10^−5^), which encodes a regulatory subunit of the condensin I complex, involved in chromosome condensation during both meiotic and mitotic prophase^42^. Together, our findings highlight the role of common non-coding cis-regulatory variation influencing expression of meiosis-related genes in modulating risk of maternal meiotic aneuploidy (Extended Data Fig. 10).
Pleiotropic effects on fertility traits
Given the relationship between crossovers and aneuploidies, we next aimed to contextualize our association findings and examine the potential shared genetic basis with other fertility-related traits. To this end, we identified the lead variant from each genome-wide significant peak in female recombination and aneuploidy GWAS and queried their associations with all recombination and aneuploidy phenotypes, as well as published GWAS of female reproductive ageing and infertility traits (that is, phenome-wide association). Our analysis revealed that the risk allele of the aneuploidy-associated lead SNP rs6006737 is also associated with lower rates of female recombination within our data ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = −0.033, s.e. = 0.011, P = 0.002), consistent with the known role of SMC1B variation in this phenotype^32^. Extending to published GWAS data^43,44^, we observed that the aneuploidy risk allele is additionally associated with greater age at menarche ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.021, s.e. = 0.003, P = 3.82 × 10^−12^) and lesser age at menopause ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = −0.047, s.e. = 0.013, P = 2.06 × 10^−4^) and thus a shorter female reproductive timespan (Fig. 4).Fig. 4. Aneuploidy, recombination and female reproductive ageing traits share an overlapping genetic basis.The lead SNP from each peak from GWAS of aneuploidy and recombination was queried for association with other fertility-related phenotypes (two-sided linear or logistic model from respective GWAS study). Darkness indicates significance of association (P value), while colour indicates direction of association. SNPs are polarized such that the aneuploidy-increasing allele is queried across all traits. Each hit is labelled based on meiosis-related candidate genes within the associated region (top) with the exception of the common 17q21.31 inversion, as well as the locus containing ACYP2 and TSPYL6, where no such candidate is apparent.
Strikingly, three of the genome-wide significant hits for female recombination rate (Supplementary Table 2) also exhibited nominal associations with aneuploidy in consistent direction. The first hit (lead SNP rs4365199; aneuploidy \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.056, s.e. = 0.012, P = 5.58 × 10^−6^; gnomAD global AF = 0.39) comprises a 175-kbp haplotype spanning synaptonemal complex component C14orf39, consistent with our previous TWAS results. The second hit (lead SNP rs12588213; aneuploidy \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.037, s.e. = 0.012, P = 1.46 × 10^−3^; gnomAD global AF = 0.42) comprises a 15-kbp haplotype spanning CCNB1IP1, encoding an E3 ubiquitin ligase demonstrated as essential for crossover maturation and fertility in mice^28^. The last hit (lead SNP rs3816474; aneuploidy \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = 0.041, s.e. = 0.014, P = 5.04 × 10^−3^; gnomAD global AF = 0.22) comprises a 59-kbp haplotype spanning the E3 ubiquitin ligase RNF212, encoding an essential regulator of meiotic recombination that interacts with CCNB1IP1 and helps to designate sites of crossovers versus non-crossovers^45^. Several of these recombination and aneuploidy-associated variants also exhibited secondary associations with ages at menarche and menopause (Fig. 4). Whereas previous studies have reported links between DNA damage response and reproductive ageing^43,46,47^, the inconsistencies in directions of effects in our data imply that the relationship with aneuploidy may be more complex. Moreover, none of the aneuploidy-associated variants exhibited even nominal associations with various definitions of female infertility^48^, potentially reflecting the multifactorial nature of clinical infertility.
Despite our discoveries of several genome- and transcriptome-wide significant loci, the proportion of variance in maternal meiotic aneuploidy explained by genotyped SNPs (that is, SNP heritability) was negligible (h^2^SNP = 0.023 ± 0.024 s.e.; Supplementary Table 5), although SNP heritability of female recombination rate was moderately higher (h^2^SNP = 0.112 ± 0.042 s.e). These estimates are in line with low reported SNP heritabilities of female fertility phenotypes^48^ and the sizeable contribution of environmental factors to maternal aneuploidy risk. Given these observations, we hypothesized that environmental factors and/or rare genetic variation contribute to residual variance in aneuploidy rates, including by effects on meiotic recombination. In support of this hypothesis, individual-specific rates of recombination were inversely associated with aneuploidy, even after controlling for maternal age and all genetic associations (binomial GLMM, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{\beta }$$\end{document} = −0.763, s.e. = 0.14, P = 8.15 × 10^−8^; Supplementary Methods), again supporting a broad, protective effect of crossovers on aneuploidy risk.
Evolution of the SMC1B risk allele
The discovery of a common aneuploidy-associated haplotype at SMC1B poses an evolutionary paradox, as alleles that reduce fitness should be subject to negative natural selection. To understand the evolution of aneuploidy-associated alleles, we examined empirical signatures of natural selection and explored the theoretical parameter space that would allow us to reconcile these observations.
One potential model for explaining the maintenance of deleterious variation is positive or balancing selection targeting the same haplotype. Given that linkage disequilibrium between causal variants and tagging variants differs across populations and over time, we focused our empirical analyses on the putative causal expression-altering SNP, rs2272804, that we previously characterized. The (A) allele of rs2272804, associated with lower SMC1B expression and higher aneuploidy risk, is globally common (gnomAD global AF = 0.44), with higher frequencies among African populations (gnomAD AF = 0.71). Inference of the historical frequency trajectory of the derived risk allele based on the ancestral recombination graph (Supplementary Methods) also suggests a higher frequency within an ancestral human population, modestly declining outside of Africa within the last 1,000 generations (Fig. 5a). While the putative ancestral (C) allele appears fixed among extant non-human great ape populations, the variant is polymorphic across high-coverage Neanderthal genomes (Supplementary Methods), and coalescence-based methods estimate that the derived allele originated 910,650 years ago (95% confidence intervals, 825,825–1,004,175)^49^. These patterns of frequency differentiation and coalescence are unremarkable and broadly conform to neutral expectations for a variant at such intermediate frequencies^49^. Similarly, haplotype-based tests for balancing selection revealed no outlier signal in the region of SMC1B (Supplementary Fig. 8; Supplementary Methods). Although we cannot formally exclude the possibility of more complex histories or subtle signatures of positive or balancing selection at this locus, we next considered a theoretical model of negative selection.Fig. 5. Evolutionary modelling of maternal meiotic aneuploidy risk haplotype.a, Estimated allele frequency trajectory of the derived (C) allele at rs2272804 inferred from 100 posterior samples of the ancestral recombination graph at the SMC1B locus. Posterior mean allele frequencies and s.e. were computed within log-spaced time bins, and error bars indicate ±2 s.e. around the posterior mean. b, Relationship between the scaling factor (α) relating fitness to the fitness proxy (number of embryos lacking maternal meiotic aneuploidies) and the selection coefficient (s) for two illustrative female reproductive time windows (Supplementary Note 1). Shaded areas denote 95% confidence intervals of the estimated effect of the haplotype tagged by GWAS lead variant rs6006737 on the resultant mean scaling factor. The horizontal dashed line indicates the theoretical threshold for neutral evolution \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(\,\frac{1}{2{N}_{{\rm{e}}}})$$\end{document} , assuming a human effective population size of 10^4^.
Specifically, we formulated a mathematical model (Supplementary Note 1) that integrates over the maternal reproductive timespan and contrasts the potential lifetime production of chromosomally normal embryos between carriers and non-carriers of the aneuploidy risk allele. The ratios of these proxies for relative fitness can be used to derive a proxy for the selection coefficient (sproxy). Based on this model, we estimated that for a historical maternal reproductive window between 18 and 35 years of age, sproxy ≈ 0.01 and increases moderately upon increasing the upper bound of maternal age (Fig. 5b and Supplementary Fig. 9). For a human effective population size (Ne) on the order of 10^4^, a selection coefficient of 0.01 is much greater than the theoretical threshold of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\frac{1}{2{N}_{{\rm{e}}}}$$\end{document} , implying that the allele should be subject to negative selection.
However, although the number of euploid embryos a woman can produce is presumably correlated with fitness, it may constitute only a weak proxy, as realized fitness is also determined by stochastic, environmental and behavioural factors largely independent of genotype. Moreover, pregnancy, childbirth and miscarriage can influence fitness in complex ways, including through impacts on maternal survival, future fertility and parental/grandparental care^50^. We observed that to reach the theoretical threshold for evading negative selection, the selection coefficient (s) must be scaled by a factor (α) ≤ 0.01, relative to sproxy (Supplementary Note 1). Although the historical relationship between fitness and the fitness proxy is unknown, their weak correlation in contemporary populations is evidenced by the lack of association between the aneuploidy risk variant and fertility phenotypes such as number of children ever born and childlessness^51^ (Fig. 4). These results highlight the inadequacy of simplistic proxies of fitness—a limitation long appreciated in the field of life history theory^52^—while reconciling the observation of a common aneuploidy-associated allele.
Discussion
Pregnancy loss is common in humans^5^ and often traces to aneuploidy originating in the maternal germline^1^. Notably, female meiosis initiates in fetal development, when homologous chromosomes pair and establish crossovers, but arrests for decades until ovulation and fertilization. Abnormal number or placement of crossovers predisposes oocytes to chromosome mis-segregation upon meiotic resumption^4,10^. Despite this understanding, the role of common genetic variation in modulating these important molecular processes in humans has remained poorly understood. Through retrospective analysis of large-scale PGT data from human IVF embryos, we mapped genetic variants associated with crossover and aneuploidy phenotypes, revealing an overlapping genetic basis involving key meiosis genes.
Although we measured overdispersion in the age-adjusted rate of aneuploidy per patient and identified genome- and transcriptome-wide significant associations, we were intrigued to find that the SNP heritability of aneuploidy was negligible. This finding aligns with low reported SNP heritabilities of female infertility phenotypes^48^, as well as potential outsize contributions of environmental and rare genetic variation influencing this trait. Nevertheless, given that common and rare variation often converge on the same genes and mechanisms^53^, our results may help inform sequencing-based studies of aneuploidy phenotypes. Supporting a model of mechanistic convergence, rare loss-of-function mutations in several of the genes implicated here have also been linked to meiotic defects and reproductive disorders in smaller clinical cohorts^40,54^. It is also plausible that a fraction of phenotypic variance for aneuploidy risk could trace to common genetic variation that is inaccessible to genotyping arrays and/or short-read sequencing, for example within technically challenging loci such as large segmental duplications, telomeres or centromeres. Recent work offered preliminary evidence that particular centromeric haplotypes are enriched among cases of Trisomy 21 (ref. ^55^). Future applications of long-read sequencing in PGT may enable validation of this hypothesis and extension to inviable aneuploidies.
The observation that alleles associated with lower rates of recombination are associated with higher rates of aneuploidy raises interesting questions about the evolutionary forces that shape recombination and aneuploidy within and between species. In addition to generating new combinations of alleles, recombination may also induce point mutations and structural variation near hotspots of double-strand breaks^14,56^. This suggests a model of stabilizing selection, whereby rates of recombination may be constrained on the lower and upper ends to limit aneuploidy and other deleterious mutations, respectively. More comprehensive models of recombination rate evolution must also consider mechanical constraints such as crossover interference, which reduces occurrence of nearby crossovers, as well as the role of crossovers in facilitating adaptation. By examining divergence across a mammalian phylogeny, a recent study reported signatures of pervasive positive selection on all meiotic components of the cohesin complex (SMC1B, RAD21L1, REC8 and STAG3), which the authors speculated could be explained by intragenomic conflict^57^. Although the asymmetry of female meiosis is susceptible to meiotic drive, the role of meiotic drive in the origins of human aneuploidy remains an important open question.
More broadly, the observation that common genetic variants modulate key reproductive phenotypes such as aneuploidy and recombination poses an intriguing evolutionary paradox, as theory predicts that variation that strongly reduces fitness should be subject to negative selection. We present a theoretical model of negative selection that interprets GWAS effects in terms of potential lifetime production of viable embryos. Our model places an upper bound on the strength of the relationship between this fitness proxy and realized fitness that would allow risk alleles to evade negative selection and reach intermediate frequencies by genetic drift. This framework could be generalized to guide expectations for future studies examining the genetic architecture of aneuploidy and other fertility-related traits.
In summary, our work provides a more complete understanding of common genetic factors that influence risk of aneuploidy—the leading cause of human pregnancy loss. These findings highlight the interplay among the forces of mutation, recombination and natural selection that operate before birth to shape human genetic diversity.
Methods
Detailed methods are provided in the Supplementary Information.
Ethics statement
The Johns Hopkins University Homewood Institutional Review Board (IRB) determined that this research did not qualify as federally regulated human participants research and therefore did not require IRB approval (HIRB00011705). This determination was made with the understanding that the research (1) does not involve a systematic research investigation designed to develop or contribute to generalizable knowledge, or (2) does not obtain information or biospecimens through intervention or interaction with a human participant, and use, study or analyse the information or biospecimens; or does not obtain, use, study, analyse or generate identifiable private information or identifiable biospecimens. Data collection and analysis was carried out in compliance with Natera’s IRB approved protocol (Salus no. 10806) involving Category 4 Exempt Research.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41586-025-09964-2.
Supplementary information
Supplementary InformationThis file contains Supplementary Methods, Supplementary Note 1, Supplementary Figs. 1–13 and References. Reporting Summary Supplementary Tables 1–6This file contains the following supplementary tables: Table 1. Age effects across aneuploidy phenotypes stratified by parental sex and individual chromosomes. Table 2. Lead independent loci found for recombination and aneuploidy GWAS. Table 3. Significant TWAS results for recombination and aneuploidy traits. Table 4. Per-chromosome estimates of overdispersion in rates of aneuploidy, accounting for maternal age. Table 5. SNP heritability estimates for recombination and aneuploidy, as well as other traits from the literature for comparison. Table 6. Estimates of cross-chromosomal covariance in maternal and paternal crossover events. Peer Review File
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Mastrorosa, F. K. et al. Complete chromosome 21 centromere sequences from a Down syndrome family reveal size asymmetry and differences in kinetochore attachment. Preprint at bio Rxiv 10.1101/2024.02.25.581464 (2025).
- 2Biddanda, A., Carioscia, S., Starostik, M. & Mc Coy, R. Data for ‘Carioscia, S. A., Biddanda, A., et al. (2025). Common variation in meiosis genes shapes human recombination phenotypes and aneuploidy risk’. Zenodo 10.5281/zenodo.15114527 (2025).
- 3Biddanda, A. mccoy-lab/natera_genotyping: final submission. Zenodo 10.5281/zenodo.17429676 (2025).
- 4Biddanda, A. & Mc Coy, R. mccoy-lab/natera_recomb: final submission. Zenodo 10.5281/zenodo.17429678 (2025).
- 5Biddanda, A. mccoy-lab/karyohmm: final submission. Zenodo 10.5281/zenodo.17429669 (2025).
- 6Carioscia, S., Biddanda, A. & Starostik, M. mccoy-lab/natera_aneuploidy: final submission. Zenodo 10.5281/zenodo.17429672 (2025).
