Haplotype-based analysis distinguishes maternal-fetal genetic contribution to pregnancy-related outcomes
Amit K. Srivastava, Julius Juodakis, Pol Sole-Navais, Jing Chen, Jonas Bacelis, Kari Teramo, Mikko Hallman, Pal R. Njølstad, David M. Evans, Bo Jacobsson, Louis J. Muglia, Ge Zhang, Michael P. Epstein, Heather J Cordell, Michael P. Epstein, Heather J Cordell, Heather Cordell

TL;DR
A new haplotype-based method distinguishes maternal and fetal genetic contributions to pregnancy outcomes, revealing distinct roles in gestational duration and fetal size.
Contribution
A novel haplotype-based approach is introduced to estimate maternal and fetal genetic contributions in mother-child pairs, overcoming limitations of genotype-based methods.
Findings
Gestational duration variance is mainly attributable to maternal transmitted and non-transmitted haplotypes.
Fetal size measurements are primarily influenced by maternal transmitted and paternal transmitted haplotypes.
Birth length and head circumference show substantial influence of correlated maternal-fetal genetic effects.
Abstract
Genotype-based approaches for the estimation of SNP-based narrow-sense heritability (h^2) have limited utility in pregnancy-related outcomes due to confounding by the shared alleles between mother and child. Here, we propose a haplotype-based approach to estimate the genetic variance attributable to three haplotypes - maternal transmitted (h^m12), maternal non-transmitted (h^m22) and paternal transmitted (h^p12) in mother-child pairs. We show through extensive simulations that our haplotype-based approach outperforms the conventional and contemporary approaches for resolving the contribution of maternal and fetal effects, particularly when m1 and p1 have different effects in the offspring. We apply this approach to estimate the explicit and relative maternal-fetal genetic contribution to the phenotypic variance of gestational duration and gestational duration-adjusted fetal size…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Fig 1
Fig 2
Fig 3
Fig 4
Fig 5- —http://dx.doi.org/10.13039/100009633Eunice Kennedy Shriver National Institute of Child Health and Human Development
- —http://dx.doi.org/10.13039/100000861Burroughs Wellcome Fund
- —http://dx.doi.org/10.13039/100000865Bill and Melinda Gates Foundation
- —http://dx.doi.org/10.13039/100011636March of Dimes Prematurity Research Center Ohio Collaborative
- —http://dx.doi.org/10.13039/100007172Cincinnati Children's Hospital Medical Center
- —Research council of Norway
- —Research Council of Norway
- —Jane and Dan Olsson Foundations
- —Swedish Medical Research Council
- —Swedish government grants
- —Norwegian Research Council/FUGE
- —UK Medical Research Council and Wellcome Trust
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenetic Associations and Epidemiology · RNA modifications and cancer · Birth, Development, and Health
Introduction
Narrow sense heritability (h^2^) is the proportion of phenotypic variance in a population attributable to additive genetic values (breeding values) [1]. Generally, the concept of the h^2^ estimation comes from balanced designs – regression of a child’s phenotype on mid-parent phenotype, correlation of full or half sibs and differences in the correlation of monozygotic and dizygotic twins [1]. However, in a population with mixed relationships, linear mixed model (LMM) is the most flexible approach accounting for both fixed and random effects [1–5].
Over the last decade, various methods [6] including Genome-based Restricted Maximum Likelihood (GREML) [7, 8], Linkage Disequilibrium Adjusted kinships (LDAK) [9], threshold Genomic Relatedness Matrices (Threshold-GRMs) [10], LD Score regression (LDSC) [11] and Phenotype Correlation-Genotype Correlation (PCGC) [12] have been developed to estimate SNP-based narrow-sense heritability (commonly known as SNP-based heritability or SNP-heritability - ) [13]. In addition, variants of these approaches such as GREML-MAF stratified (GREML-MS) [14], GREML-LD and MAF stratified (GREML-LDMS) [15] and LDAK-MAF stratified (LDAK-MS) [16] have enabled partitioning of the genetic variance into additive and non-additive components as well as variance components attributable to chromosomes, genes and inter-genic regions. The above approaches have helped explain a large proportion of the missing heritability in various complex diseases and quantitative traits [8–11,13,16–18]. Nevertheless, conventional approaches utilizing an individual’s genotype information are less suited for pregnancy-related outcomes which are jointly influenced by direct fetal and indirect parental genetic effects [19–22]. In recent years, several studies using genotype information in mother-child duos [19,21,23–26] and parent-child trios [27] have examined the contribution of parental genetic effects [28, 29] and fetal genetic effects in various pregnancy-related outcomes. However, these approaches are based on several assumptions including equal effects of maternal and paternal transmitted alleles in child. Hence, the estimation of heritability in pregnancy-related outcomes demands a direct approach with relaxed assumptions.
Here, we consider mother-child pair as a single analytical unit consisting of three haplotypes corresponding to maternal transmitted (m1), maternal non-transmitted (m2) and paternal transmitted (p1) alleles [30–32]. Use of such an analytical unit provides an advantage over conventional approaches based on individual’s genotype information by avoiding the confounding of m1 which can influence pregnancy-related outcomes through both the mother and child (Fig 1A) [22]. We generate three separate genetic relatedness matrices M1, M2 and P1 using only m1, only m2 and only p1, respectively. We fit all three matrices simultaneously in a linear mixed model (LMM) to estimate variance attributable to each haplotype (Fig 1B). Although our approach doesn’t directly estimate SNP-heritability, we use , and to represent variance attributable to m1, m2 and p1 respectively for the comparison purposes. We compare the behavior of our newly developed haplotype-based genome-wide complex trait analysis approach (H-GCTA) with existing genotype-based approaches such as Genome-wide Complex Traits Analysis (GCTA) [7, 8] and Maternal-Genome-wide Complex Traits Analysis (M-GCTA) [21,24] approach using simulated phenotypes with varying contributions and correlations of maternal and fetal genetic effects. We show that H-GCTA outperforms the conventional and other contemporary approaches, particularly when the maternal and paternal transmitted alleles have different effects (e.g., parent-of-origin effects - POEs) on a fetal trait and traits with joint maternal-fetal effects.
Comparison of conventional (GCTA) and H-GCTA approach.A) Schematic representation of the difference between the conventional genotype-based and newly developed haplotype-based analysis approach; the left part of the figure represents the conventional approach based on genotypes of mother and child separately and the right part represents haplotype-based analysis by treating mother/child pairs as analytical units. Green vertical arrow represents maternal genetic effects in fetus, whereas Blue one represents fetal genetic effects in mother during pregnancy. Red and Golden curved arrows represent maternal and fetal genetic effects in mother and fetus, respectively. Red and Indigo slant arrows represent the environmental effects on mother and fetus, respectively. m1 (f1): Maternal transmitted alleles; m2: Maternal non-transmitted alleles; f2 (p1): paternal transmitted alleles; E: Environmental factors. B) Schematic representation of the difference between conventional approach of heritability estimation utilizing genotype-based GRMs and our approach utilizing haplotype-based GRMs (representing the example of mother-child duos). While Conventional GCTA approach fits individual’s genotype-based GRM separately in mothers and children (left side), haplotype-based approach fits three haplotype-based GRMs together (right side). σg2: phenotypic variance attributable to mothers’ or children’s genotypes; σm12, σm22 and σp12: phenotypic variance attributable to m1, m2 and p1 respectively; σe2: phenotypic variance attributable to E.
We further apply our approach to a cohort of 10,375 mother-child pairs to estimate the explicit and relative contribution of maternal-fetal genetic effects to the phenotypic variance of gestational duration and gestational duration adjusted fetal size measurements at birth, including birth weight, birth length and head circumference. Our results suggest that genetic variance in gestational duration is primarily attributable to the maternal genome, i.e., the maternal transmitted (m1) and non-transmitted (m2) alleles, whereas genetic variance in fetal size measurements at birth are largely attributable to fetal genome, i.e., maternal transmitted (m1) and paternal transmitted (p1) alleles. In addition, a higher attribution to m1 as compared to m2 and p1 ( ) suggests a large contribution of correlated maternal-fetal genetic effects to the variance of birth length and head circumference. Our haplotype-based approach provides a direct method with relaxed underlying assumptions to estimate the explicit and relative maternal-fetal contributions to the phenotypic variance of pregnancy-related outcomes.
Results
Heritability estimation using simulated data
We first evaluated the utility and robustness of H-GCTA using simulated phenotypes based on the real genotype data from a homogenous cohort (Avon Longitudinal Study of Parents And Children; ALSPAC) with 5,369 mother-child pairs and pooled dataset (diverse European populations, including ALSPAC) with 10,375 mother-child pairs. Traits were simulated with varying contributions and correlation of maternal and fetal genetic effects (Table 1 and methods). All traits were simulated with a total genetic variance at 50%, using a randomly selected set of 10,000 causal variants. Traits with correlated maternal-fetal genetic effects were simulated using the same set of causal variants in mother and child. In addition, we also incorporated different levels of POEs (maternal and paternal transmitted alleles had different effects in fetus) in varying proportion of causal variants for traits with only fetal and joint maternal-fetal effects (Table 1 and methods).
Table 1: Genetic models for different simulation conditions.
We compared the performance of H-GCTA with conventional GCTA approach and a contemporary M-GCTA approach for each simulated trait. For each approach, we estimated the genetic variance using three models – GREML, LDAK-Thin (where all pruned SNPs were given equal weights) and LDAK with SNP-specific weights (hereafter referred as LDAK-Weights, where each SNP had different weights based on pair-wise LD) (S1 Fig). Using any particular approach, each model yielded similar results when used with recommended α values (GREML: α = -1.0; LDAK and LDAK-Thin: α = -0.25) which represents the extent to which minor allele frequency (MAF) influences the variance of SNP effects on phenotypes [16]. We observed that the estimated genetic variance was similar in pooled datasets and homogenous cohort ALSPAC. However, due to small sample size, the estimated genetic variance in ALSPAC cohort had larger standard errors (S3 and S4 Figs and S5–S12 Tables). Here, we discuss the results of simulated traits from pooled dataset using GREML (α = -1.0) fitted through GCTA, M-GCTA and H-GCTA approach.
Heritability of simulated traits with only maternal effects
Using conventional approach for maternal traits in mothers and children separately, the estimated SNP-heritability ( ) based on maternal (m) and fetal (f) genotypes was 45.0% (S.E. = 8.6%) and 13.6% (S.E. = 8.6%) respectively (Fig 2A and S13 Table). We also used M-GCTA in mother-child duos to estimate the variance attributable to indirect maternal effect ( ), direct fetal effect ( ) and maternal-fetal covariance ( ) (Fig 2A and S13 Table). Using H-GCTA for maternal traits in mother-child duos, variance attributable to maternal transmitted alleles ( ), maternal non-transmitted alleles ( ), and paternal transmitted alleles ( ) were 25.5% (S.E. = 4.8%), 22.9% (S.E. = 4.5%) and -3.0% (S.E. = 4.2%) respectively (Fig 2A and S13 Table). M-GCTA and H-GCTA accurately distinguished the maternal origin of the simulated traits; however, the conventional GCTA also showed a superficial contribution from the fetal genome (13.6%, approximately one quarter of the based on maternal genotype) due to 50% alleles shared between mother and child (S13 Table).
Comparison of h^2 for simulated traits from pooled dataset – maternal traits, fetal traits, and traits with independent maternal-fetal genetics effects.Comparison of h^2 for simulated traits from pooled dataset, estimated through different approaches fitting GREML (α = -1.0): A) maternal traits; B) fetal traits; C) traits where independent sets of causal variants have effects through mother and fetus; D) traits where same set of causal variants have effects through mother and fetus. For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of Pooled dataset. P-values were calculated using z test statistics (two sided). * = (p value <5.0E-02), ** = (p value <1.0E-02), *** = (p value <1.0E-03) and **** = (p value <1.0E-04).
Heritability of simulated traits with only fetal effects
Like maternal traits, we used conventional GCTA to estimate for fetal traits in mothers and children separately. The estimated based on m and f were 10.9% (S.E. = 8.8%) and 51.9% (S.E. = 8.8%) respectively (Fig 2B and S14 Table). Similarly, using M-GCTA for fetal traits in mother-child duos, variance attributable to indirect maternal effect (M’), direct fetal effect (G) and direct-indirect effect covariance (D) were -3.0% (S.E. = 6.0%), 52.7% (S.E. = 6.1%) and 0.0% (S.E. = 4.7%) respectively (Fig 2B and S14 Table). Using H-GCTA in mother-child duos, we estimated the variance of the simulated fetal traits attributable to m1 ( ), m2 ( ) and p1 ( ) (Fig 2B and S14 Table). While conventional GCTA estimated superficial contributions from maternal genotypes besides fetal genotypes, M-GCTA and H-GCTA clearly showed the fetal origin of the simulated phenotypes. As compared to M-GCTA, H-GCTA further resolved almost equal contributions from maternal and paternal transmitted alleles through m1 and p1.
Heritability of simulated traits with independent maternal-fetal genetic effects
Traits with independent maternal and fetal effects were simulated in two ways – using the same set and different sets of causal variants in mothers and children. Using independent sets of causal variants and conventional GCTA approach, the estimated based on m and f were 24.9% (S.E. = 8.7%) and 27.8% (S.E. = 8.7%) respectively (Fig 2C and S15 Table). Using M-GCTA approach, variance attributable to indirect maternal effect (M’), direct fetal effect (G) and direct-indirect effect covariance (D) were estimated as 22.1% (S.E. = 6.6%), 26.8% (S.E. = 5.6%) and -4.7% (S.E. = 5.5%) respectively (Fig 2C and S15 Table). Conversely, H-GCTA estimated the genetic variance attributable to m1 ( ), m2 ( ) and p1 ( ) (Fig 2C and S15 Table). We observed similar results from traits, simulated using same set of causal variants with independent maternal-fetal genetic effects in mothers and children (Fig 2D and S16 Table). We observed that conventional GCTA and M-GCTA showed equal contribution of maternal and fetal genotypes to the phenotypic variance of the simulated phenotypes. As compared to M-GCTA, H-GCTA estimated the contributions of maternal transmitted (m1), maternal non-transmitted (m2) and paternal transmitted alleles (p1) as expected, i.e., 2:1:1 (S5 Fig and S15 and S16 Tables) which demonstrated equal and independent maternal and fetal contributions.
Heritability of simulated traits with correlated maternal-fetal genetic effects
We simulated traits influenced by joint maternal-fetal genetic effects with average negative (-0.5, -1.0) and positive (0.5, 1.0) correlation by using same set of causal variants in mothers and children. For traits with 100% negative correlation of maternal-fetal genetic effects, the estimated using conventional GCTA approach, were 8.2% (S.E. = 9.4%) and 11.4% (S.E. = 9.4%) based on m and f, respectively (Fig 3A and S17 Table). Using M-GCTA approach, the variance attributable to indirect maternal effect (M’), direct fetal effect (G) and direct-indirect effect covariance (D) were estimated as 35.9% (S.E. = 8.4%), 38.3% (S.E. = 6.8%) and -36.1% (S.E. = 6.2%) respectively (Fig 3A and S17 Table). H-GCTA further partitioned the variance into variance components attributable to m1 ( ), m2 ( ) and p1 ( ) (Fig 3A and S17 Table). Although, negative values of usually correspond to noise, they are important for interpretation of results for traits with negative correlation of maternal-fetal genetic effects. We observed that conventional GCTA substantially underestimated the genetic contribution of maternal and fetal genomes. As expected, M-GCTA estimated equal contribution of maternal and fetal genetic effects to the phenotypic variance whereas negative and equal contribution of direct-indirect effect covariance suggested 100% negative correlation of maternal-fetal genetic effects. Similarly, H-GCTA showed no contribution from m1 (due to 100% negative correlation of maternal-fetal genetic effects) and almost equal contribution from m2 and p1 to the phenotypic variance (S5 Fig). We observed similar patterns for traits with 50% negative correlation of maternal-fetal genetic effects (Fig 3B and S18 Table).
Comparison of h^2 for simulated traits from pooled dataset – traits with correlated maternal-fetal genetics effects.Comparison of h^2 estimated through different approaches fitting GREML (α = -1.0) for simulated traits with joint maternal–fetal effects from pooled dataset: A) average correlation = -1.0; B) average correlation = -0.5; C) average correlation = 1.0; D) average correlation = 0.5. For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of Pooled dataset. P-values were calculated using z test statistics (two sided). * = (p value <5.0E-02), ** = (p value <1.0E-02), *** = (p value <1.0E-03) and **** = (p value <1.0E-04).
Similarly, conventional GCTA approach estimated based on m and f as 46.2% (S.E. = 8.6%) and 44.8% (S.E. = 8.6%), respectively for traits with 100% positive correlation of maternal-fetal genetic effects (Fig 3C and S19 Table). M-GCTA approach estimated the variance attributable to indirect maternal effect (M’), direct fetal effect (G) and direct-indirect effect covariance (D) as 18.9% (S.E. = 5.0%), 20.6% (S.E. = 5.6%) and 19.7% (S.E. = 4.6%), respectively (Fig 3C and S19 Table). Using H-GCTA, we estimated the genetic variance of simulated traits attributable to m1, m2 and p1 [( ), ( ), ( )] (Fig 3C and S19 Table). While conventional GCTA substantially overestimated the variance attributable to maternal and fetal genotypes, M-GCTA estimated equal contribution of indirect maternal effects, direct fetal effects and direct-indirect effects covariance to the phenotypic variance. Similarly, H-GCTA showed much larger contribution from m1 and equal contribution from m2 and p1 to the phenotypic variance which follows a ratio of 4:1:1 in case of 100% positive correlation of maternal-fetal genetic effects (S5 Fig). Similar patterns were observed for traits with 50% positive correlation of maternal-fetal genetic effects (Fig 3D and S20 Table).
Heritability of simulated fetal traits with POEs
We also estimated genetic variance using GREML for simulated fetal traits with different levels of parent-of-origin effects (POEs) in varying proportion of causal variants. We simulated two scenarios where maternal imprinting was mimicked by reducing the effect of m1 as compared to p1 in 25% and 50% of the causal variants. In each scenario, we generated a range of imprinting patterns such as . The first three conditions represented partial maternal imprinting whereas the last condition, i.e., represented complete maternal imprinting. Using our approach (H-GCTA), we estimated the total ( ) as expected (~ 50%) (Fig 4 and S21 Table). Results from H-GCTA showed that the variance attributable to m1 ( ) decreased whereas the variance attributable to p1 ( ) increased in accordance with the level of imprinting in each scenario (Fig 4A and 4B). We also compared results from our approach with those from GCTA and M-GCTA. While GCTA and M-GCTA were unable to detect contribution of parent-of-origin effects (POEs), H-GCTA detected the variance attributable to POEs as (S21 Table).
Comparison of variance attributable to maternal transmitted (m1), maternal non-transmitted (m2), and paternal transmitted (p1) haplotypes for simulated fetal traits with parent-of-origin effects (POEs).Variance attributable to m1, m2 and p1 estimated through H-GCTA using GREML (α = -1.0) model in simulated fetal traits from pooled dataset: A) 50% causal variants with POEs; B) 25% causal variants with POEs. POEs were incorporated by reducing the effect of m1 as compared to p1 by multiplying effects of m1 with (1 – I) where I is the imprinting factor such as 0.25, 0.50, 0.75 and 1.0. In each scenario, m1 shows either no imprinting, i.e., I = 0.0 (m1/p1=1.0/1.0) or partial imprinting, i.e., I = 0.25-0.75 (m1/p1=0.75/1.0−0.25/1.0) or complete imprinting, i.e., I = 1.0 (m1/p1=0.0/1.0).
Heritability of simulated traits with correlated maternal-fetal genetic effects and POEs
To further investigate the intriguing relationships of maternal and fetal genetic effects in dyadic traits, we simulated traits with joint maternal-fetal genetic effects with average correlation = 1.0 and different levels of parent-of-origin effects (POEs) [ ]. For simplicity, we assumed that all causal variants exhibited POEs.
For traits with 100% correlated maternal-fetal genetic effects and varying levels of POEs, conventional GCTA approach substantially overestimated whereas M-GCTA and H-GCTA approach slightly overestimated . While GCTA and M-GCTA failed to detect contribution of POEs, our approach (H-GCTA) clearly identified that variance attributable to m1 ( ) decreases with increasing levels of maternal imprinting and eventually becomes equal to the variance attributable to m2 ( ) or p1 ( ) in case of complete maternal imprinting ( ). As expected, relative variance attributable to m1, m2 and p1 changes from in the absence of maternal imprinting to in case of complete maternal imprinting (S5 Fig and S22 Table).
Heritability estimation of pregnancy-related outcomes using empirical data
All analyses for the estimation of genetic variance were performed using imputed genotype data of ~ 11 million markers across 10,375 mother-child pairs. In addition, three MAF cut-offs (0.001, 0.01 and 0.05) yielding approximately 9 million, 7 million and 5.5 million markers respectively, were used for analysis. Only independent mother-child pairs (kinship coefficient < 0.05) were used in analysis and 20 principal components (PCs) were used along with genotype-based GRMs in LMM (S6 Fig). For haplotype-based GRMs, we used 30 PCs (10 PCs corresponding to each haplotype) as covariates in LMM (S6 Fig). Like simulated traits, we estimated genetic variance using three approaches – conventional GCTA approach, M-GCTA approach and H-GCTA approach. For each approach, we fitted three models – GREML, LDAK-Thin and LDAK-Weights. Two values of α (-0.25 and -1.0), which represents the extent to which minor allele frequency (MAF) influences the variance of SNP effects on phenotypes were used for each model. Here, we describe results based on GRMs calculated through all polymorphic SNPs and three models with recommended α values, i.e., GREML (α = -1.0), LDAK-Thin (α = -0.25) and LDAK-Weights (α = -0.25). Results based on all polymorphic SNPs using other models are provided in S23 Table. Similarly, results based on GRMs calculated through SNPs with MAF > 0.001, SNPs with MAF > 0.01 and SNPs with MAF > 0.05 are provided in S1 Text and S24, S25 and S26 Tables.
Heritability of gestational duration
Using GREML (α = -1.0), the conventional GCTA approach estimated of gestational duration based on m and f – ( ; S.E. = 5.4%) and ( ; S.E. = 5.2%). Our approach (H-GCTA) further resolved the variance attributable to m1 – 17.3% (S.E. = 5.2%;), m2 – 12.3% (S.E. = 5.2%) and p1 – 0.0% (S.E. = 5.0%) (Fig 5A and Table 2A). Results using our approach suggested that the genetic variance in gestational duration was primarily influenced by maternal genome, i.e., the SNPs which influence gestational duration through maternal genetic effect. Comparison with M-GCTA confirmed the results from H-GCTA (Fig 5A and Table 2A). The genetic variance estimated through LDAK-Thin (α = -0.25) was similar to those obtained from GREML (α = -1.0). However, estimates from LDAK-Weights (α = -0.25) were substantially larger than those obtained from GREML (α = -1.0) (Table 2A). This pattern is consistent with previous observation for other traits using LDAK model [16] and thoroughly discussed elsewhere [13].
Table 2: Comparison of h^2 for gestational duration and fetal size measurements at birth.
Comparison of h^2 for gestational duration and fetal size measurements at birth.Comparison of h^2 estimated through different approaches fitting GREML (α = -1.0) for pregnancy-related outcomes in unrelated mother-child pairs (relatedness cutoff > 0.05): A) gestational duration, B) birth weight, C) birth length, and D) head circumference. For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). For conventional GCTA and M-GCTA approach analyses were adjusted for 20 principal components (PCs) whereas for H-GCTA, analyses were adjusted for 30 PCs (10 PCs corresponding to m1, m2 and p1 each). P-values were calculated using z test statistics (one sided). * = (p value <5.0E-02), ** = (p value <1.0E-02), *** = (p value <1.0E-03) and **** = (p value <1.0E-04).
Heritability of gestational duration adjusted birth weight
Analysis using conventional GCTA showed that the estimated of birth weight based on m and f were 16.3% (S.E. = 6.1%) and 34.3% (S.E. = 6.2%) respectively. Using our approach, we further distinguished the variance attributable to m1 – 18.6% (S.E. = 6.1%); m2 – 1.5% (S.E. = 5.6%) and p1 – 13.6% (S.E. = 5.9%) (Fig 5B and Table 2B). The estimates obtained through H-GCTA suggested that genetic variance in birth weight was primarily determined by the fetal genome. Comparison of genetic variance estimated from our approach with those from M-GCTA illustrated that genetic variance in birth weight was mainly attributable to the SNPs which influence birth weight only through direct fetal effect (Fig 5B and Table 2B). Like gestational duration, genetic variance estimated through LDAK-Thin (α = -0.25) was similar to those obtained from GREML (α = -1.0) whereas LDAK-Weights (α = -0.25) estimated larger .
Heritability of gestational duration adjusted birth length
We estimated of birth length based on m ( ; S.E. = 8.3%) and f ( ; S.E. = 8.4%) using conventional GCTA approach. While M-GCTA indicated that birth length is largely influenced by positive maternal-fetal covariance ( ; S.E. = 8.2%), H-GCTA resolved the genetic variance attributable to m1 – 24.2% (S.E. = 8.4%); m2 – 0.0% (S.E. = 7.9%) and p1 – 4.4% (S.E. = 8.0%) (Fig 5C and Table 2C). H-GCTA showed that unlike birth weight, variance in birth length was mainly attributable to m1 with a much smaller attribution to p1 ). According to our simulations, this pattern, i.e., could be generated due to either positively correlated maternal-fetal genetic effects (Fig 3 and S19 and S20 Tables) or POEs (Fig 4 and S21 Table). A previous study using M-GCTA [21] suggested that birth length was influenced by both maternal and fetal genome with different genes contributing to the maternal and fetal effects. However, current study indicates that variance in birth length is primarily attributable to positively correlated maternal-fetal genetic effects along with possible POEs [ ].
Heritability of gestational duration adjusted head circumference
SNP-based narrow-sense heritability ( ) of head circumference estimated using a conventional GCTA approach was 33.5% (S.E. = 10.2%) and 45.7% (S.E. = 10.5%) based on m and f, respectively. Using H-GCTA, we resolved the variance attributable to maternal and fetal genomes into m1 – 36.4% (S.E. = 10.4%); m2 – 5.2% (S.E. = 9.9%) and p1 – 18.8% (S.E. = 10.0%) (Fig 5D and Table 2D). The difference between ( ) and ) suggested that head circumference is largely influenced by fetal genetic effects along with either correlated maternal-fetal genetic effects or possible POEs or both. Similarly, the results from M-GCTA analysis showed approximately equal contribution to variance of head circumference from G and D (Table 2D). The comparison of results from H-GCTA and M-GCTA suggested that head circumference was primarily determined by fetal genome, i.e., phenotypic variance of head circumference was largely influenced by direct fetal effects along with positively correlated joint maternal-fetal effects or POEs. The results also suggested some influence through explicit maternal genetic effect (Table 2D).
Discussion
Unlike widely studied complex human traits [8,9,13,16,33], pregnancy-related outcomes are simultaneously influenced by maternal and fetal genomes. Therefore, conventional genotype-based approaches that were developed to estimate the genetic contribution to phenotypic variance are limited in addressing the confounding of shared alleles between maternal and fetal genomes. Here, we consider the mother-child pair as a single analytical unit with three haplotypes – maternal transmitted (m1), maternal non-transmitted (m2) and paternal transmitted (p1). Using such an analytical unit, we simultaneously disentangle the contribution of m1 (exclusive and joint maternal-fetal effects), m2 (exclusive maternal effect) and p1 (exclusive fetal effect) to the phenotypic variance. Using the simulated data with varying contributions and correlation of maternal and fetal genetic effects, we show that our newly developed H-GCTA approach can explicitly resolve maternal and fetal contributions and outperforms the GCTA and M-GCTA approach, particularly in the presence of POEs (Fig 4 and S21 Table). We further apply our haplotype-based approach to distinguish the genetic contribution of mothers and offspring to the phenotypic variance of gestational duration and gestational duration adjusted fetal size measurements at birth in 10,375 European mother-child pairs. A comparison of results from H-GCTA with those from M-GCTA and conventional GCTA approach reveals that gestational duration is primarily influenced by maternal genome whereas fetal size measurements at birth are largely driven by fetal genome. The new results not only confirm the previous findings from epidemiological [34–40] and genetic [21–25,27,31,41–46] studies but also provide new insights into the genetic architecture of fetal size at birth.
The results based on ~11 million polymorphic SNPs show that approximately 17% and 12% variance in gestational duration is attributable to the m1 and m2, respectively with a minimal contribution from p1 (Fig 5 and Table 2). In contrast, variance in gestational duration adjusted fetal size measurements at birth are mainly contributed by m1 = 19-36%) and p1 ( = 4-14%) with a minimal contribution from m2 (Fig 5 and Table 2). Among fetal size measurements at birth, variance in birth weight has significant contributions from m1 ( = 19%) as well as p1 ( = 14%) whereas variance in birth length and head circumference are mainly attributable to m1 (birth length: = 24%; head circumference: = 36%). These new results suggest that variance in gestational duration is mainly attributable to indirect maternal genetic effects whereas variance in birth weight is mainly attributable to direct fetal genetic effects. In addition, a larger contribution of m1 as compared to m2 and p1 ( ) to the variance of birth length and head circumference suggests a substantial contribution of correlated maternal-fetal genetic effects or possible POEs or both (Table 2). Results using SNPs with MAF > 0.001, SNPs with MAF > 0.01 and SNPs with MAF > 0.05 showed similar results (S24-S26 Tables).
As observed in the analyses of simulated traits, estimated genetic variance observed through GREML (α = -1.0) and LDAK-Thin (α = -0.25) are similar for pregnancy-related outcomes. Consistent with previous reports [13,16], estimated genetic variance using LDAK-Weights (α = -0.25) are up to 30% higher than those using GREML (α = -1.0). However, analysis using LDAK-Thin (α = -0.25) provides slightly lower estimates for gestational duration and birth weight and substantially lower estimates for birth length and head circumference. Similarly, GREML (α = -0.25) and LDAK-Weights (α = -1.0) estimate substantially smaller whereas LDAK-Thin (α = -1.0) estimates substantially larger (S23 Table). These results are consistent with the results of simulated traits in the current study and could be due to misspecification of analytical model [47]. For GREML (α = -1.0) model, we observe the largest estimates of genetic variance for each trait using all polymorphic SNPs, which decreases with increasing threshold of MAF cutoff (number of SNPs decrease with increasing MAF cutoff) (S24-S26 Tables). The decrease in the estimated genetic variance with decrease in number of markers is a general limitation of GREML model which is dependent on several assumptions [13,48].
In general, results for pregnancy-related outcomes follow a similar pattern as those for simulated traits. Specifically, estimated genetic variance of gestational duration and birth weight mimic a pattern similar to the simulated maternal and fetal traits, respectively. Interestingly, estimated genetic variance of head circumference follow a mixed pattern with a large fetal and small maternal genetic influence along with a large influence of maternal transmitted alleles. Irrespective of the analytical models and MAF cut-offs for GRM calculation, H-GCTA estimates a larger contribution of m1 as compared to p1 with almost no contribution of m2 to the phenotypic variance of birth length (Tables 2 and S24–S26). Similarly, M-GCTA estimates a larger contribution of correlated maternal-fetal genetic effects (D) as compared to direct fetal effects (G) with almost no contribution of indirect maternal effect (M’) (Tables 2 and S24–S26). It is possible that there could be complicated maternal-fetal interactions that are not modeled by any of these approaches.
Interestingly, we observe that the contribution of m1 is larger than m2 or p1 for every pregnancy phenotype in the current study. There are several possible explanations for this pattern of results. The most obvious explanation is that m1 can influence a pregnancy phenotype through both the mother and fetus. For example, for a trait mainly defined by the maternal genome like gestational duration, higher contribution of m1 in comparison to m2 could be due to small but non-zero fetal effect of the m1 alleles. Similarly, for traits mainly defined by the fetal genome such as fetal size measurements at birth, higher contribution of m1 in comparison to p1 could be due to small but non-zero maternal effect of the m1 alleles. Assuming maternal-fetal additivity (genetic effects through mother and fetus influence a pregnancy-related outcome in additive manner) with independent maternal-fetal genetic effects and no POEs, ( - ) is equal to . A larger value of as compared to and ( ) suggests presence of either positively correlated maternal-fetal genetic effects or possible POEs or both. For birth length and head circumference, the near zero maternal genetic effect (as estimated by M’ in the M-GCTA analysis) and the null contribution from the maternal non-transmitted alleles (m2) suggests possible existence of POEs along with positively correlated maternal-fetal genetic effects. Besides the above-mentioned explanations, several other biological phenomena such as interaction between SNPs within the mother or fetus (epistasis) and gene-environment interaction may influence the pattern of genetic variance of pregnancy outcomes.
Despite the above advances, our current approach has certain limitations. Our approach by itself cannot explicitly distinguish the contribution of correlated maternal-fetal genetic effects from POEs. Current haplotype-based approach attempts to relax some of the underlying assumptions in conventional and contemporary approaches such as equal effects of maternal and paternal transmitted alleles in fetus and allelic additivity. However, the interpretation of the results requires assumptions on maternal-fetal additivity and random mating population. In addition, heritability estimation in our approach can also be affected if assumptions such as absence of epistasis (gene-gene interaction) and gene-environment interaction are not met.
In conclusion, we introduce an approach (H-GCTA) to partition phenotypic variance of pregnancy outcomes to maternal transmitted, non-transmitted and paternal transmitted alleles in mother/child pairs. This method provides a direct way to dissect the maternal and fetal genetic contributions to pregnancy-related outcomes. In addition, H-GCTA can be extended to parent-child trios to detect the paternal genetic effect (genetic nurturing effect) [20]. In combination with existing approaches such as M-GCTA and Trio-GCTA [21,24,27], H-GCTA can also be used to resolve the contribution of POEs and correlations between maternal and fetal genetic effects. We believe this approach represents a significant enhance to the genetic analytic toolbox of pregnancy-related outcomes that others will also employ moving forward.
Methods
Datasets and quality control
We used genome wide single nucleotide polymorphism (SNP) data from 10,375 mother-child pairs from five European cohorts to distinguish the maternal-fetal genetic contribution to the phenotypic variance of pregnancy-related outcomes such as gestational duration and fetal size measurements at birth (birth weight, birth length and head circumference) (S1 Text and S1 Fig). The study cohorts included Avon Longitudinal Study of Parents and Children (ALSPAC) [49, 50] from UK, Hyperglycemia and Adverse Pregnancy Outcome study (HAPO) [51] from UK, Canada, and Australia, Finnish dataset (FIN) [31,52], Danish Birth Cohort (DNBC) [53], Norwegian Mother, Father and Child Cohort study (MoBa) [54] (S1 Text and S2 Fig and S1-S4 Tables). A detailed description of data sets can be found in supporting data (S1 Text).
Genotyping of DNA extracted from whole blood or swab samples was done on various SNP array platforms such as Affymetrix 6.0, Illumina Human550-Quad, Illumina Human610-Quad, Illumina Human 660W-Quad. SNP array data was filtered based on SNP and sample quality. Quality Control (QC) of genotypes data was performed at two levels – marker level and individual level. Marker level QC was conducted using PLINK 1.9 [55] on the basis of SNP call rate, minor allele frequency (MAF), Hardy-Weinberg Equilibrium (HWE) and individual level QC was done on the basis of call rate per individual, average heterozygosity per individual, sex assignment, inbreeding coefficient. Non-European samples were removed from the study by principal components analysis (PCA) anchored with 1,000 genome samples. Following QC, genotype data of mother-child pairs were phased using SHAPEIT 2 [56]. SHAPEIT 2 automatically recognizes pedigree information provided in the input files. When phasing mother/child duos together, the first allele in child was always the transmitted allele from mother and the second one from father. We imputed the pre-phased genotypes for missing genotypes on Sanger Imputation Server using Positional Burrows-Wheeler Transform (PBWT) software [57]. Haplotype reference consortium (HRC) panel was utilized as reference data for imputation purpose [58]. The phasing and mother-child allele transmission of the imputed alleles were retained from the pre-phasing stage.
QC of phenotype data was conducted considering gestational duration as the primary outcome. Pregnancies involving history of risk factors for preterm birth or any medical complication during pregnancy influencing preterm birth, C-sections and non-spontaneous births were excluded. We also excluded, non-singlet pregnancies, pregnancies who self-reported non-European ancestry and children who could not survive > 1 year. Additionally, gestational duration was adjusted for fetal sex; fetal size measurements at birth such as birth weight, birth length and head circumference were adjusted for gestational duration up to third orthogonal polynomial component. Details of genotype and phenotype QC is provided in supporting data (S1 Text).
Statistical method
We used a linear mixed model (LMM) to estimate the SNP-heritability ( ) of simulated and empirical phenotypes. This model assumes that the phenotype was normally distributed - Y ~ N(μ, V) with mean μ and variance V. We created GRMs from standardized genotypes/haplotypes utilizing the method developed by Yang et.al. [7, 8] and Speed et. al. [9,16]. Each cell of the genotype-based GRM and haplotype- based GRM represented relatedness between two individuals j and k calculated based on genotypes (Equation 1) and haplotypes (Equation 2) respectively.
Where, A_jk_ is the correlation coefficient between two individuals j and k averaged over all SNPs; S is number of SNPs used to calculate relatedness; x_ij_ is the number of copies of the reference alleles in individual j for SNP i (i.e., 0 or 1 or 2); x_ik_ is the number of copies of the reference alleles in individual k for SNP i (0 or 1 or 2); p_i_ is frequency of reference allele of SNP i.
Where, T_jk_ is the correlation coefficient between two mother/child duos or full trios j and k based on maternal transmitted alleles (m1) or maternal non-transmitted alleles (m2) or paternal transmitted alleles (p1) or paternal non-transmitted alleles (p2); S is number of SNPs whose alleles are used to calculate relatedness; c_ij_ is the number of the reference alleles of m1 or m2 or p1 or p2 in mother/child duo or full trio j for SNP i (i.e., 0 or 1); c_ik_ is the number of the reference alleles of m1 or m2 or p1 or p2 in mother/child duo or full trio k for SNP i (i.e., 0 or 1); p_i_ is frequency of reference allele of SNP i.
For genotype-based analysis, we created two GRMs - M and F by utilizing maternal genotypes (m) and fetal genotypes (f) respectively. For haplotype-based analysis, we considered mother-child pair as a single analytical unit consisting of three haplotypes corresponding to m1, m2, and p1. We created three separate GRMs - M1, M2 and P1 using only m1, only m2 and only p1 respectively (Fig 1A and 1B). We fitted mothers’ genotype-based GRM (M) (Equations 3 and 4) and children’s genotype-based GRM (F) (Equations 5 and 6) separately in LMM to estimate phenotypic variance attributable to maternal and fetal genotypes respectively. To calculate explicit contribution of maternal and fetal genomes to the overall narrow-sense heritability, we simultaneously fitted all three matrices (M1, M2 and P1) in LMM and estimated the additive genetic variance attributable to each of the three components (Equation 7, 8).
Where, is a vector of standardized phenotype (n x 1; where, n is number of individuals); X is a matrix of covariates representing fixed effects (n x p; where, p is number of fixed effects); β is a vector of fixed effects (p x 1); is a matrix of mothers’ standardized genotypes (m) (n x S; where, S is number of SNPs); is a matrix of children’s standardized genotypes (f) (n x S); is a matrix of standardized maternal transmitted alleles (m1) (n x S); is a matrix of standardized maternal non-transmitted alleles (m2) (n x S); is a matrix of standardized paternal transmitted alleles (p1) (n x S); ε is a vector of residual effects with e ~ N(0, Iσ^2^e); and are vectors of random effect sizes for maternal genotypes (m) and fetal genotypes (f); u_m1_, u_m2_ and u_p1_ are vectors of random effect sizes for maternal transmitted (m1), maternal non-transmitted (m2) and paternal transmitted (p1) alleles respectively (m x 1); is Variance-Covariance matrix of phenotypes; M, F, M1, M2 and P1 are GRMs generated from , , , and respectively (e.g., ); σ^2^ are the variances of the respective components.
As previously reported(9, 16, 47), genetic architecture is parametrized on MAF and pair-wise LD, assuming , where , and are the effect size, weight and reference allele frequency of SNP i and α is the scaling factor which represents the extent to which MAF influences the variance of per-allele effect of SNP i [var( )]. We calculated SNP-specific weights using LDAK and scaled GRMs with two α values ( ) in each model. Each standardized column of genotype/haplotype matrix (n x S) was multiplied by ( ) before fitting into LMM.
Implementation
Phenotypic variance, i.e., Var(Y) attributable to different components could be estimated by fitting GRMs corresponding to those components in LMM. We used REML implemented through GCTA(7, 8) and LDAK [9,16] to estimate of simulated and empirical phenotypes. For genotype-based analysis through conventional GCTA approach [7], we fitted a GRM generated from mothers’ genotypes (M) and children’s genotypes (F) separately in LMM whereas for haplotype-based analysis through H-GCTA approach, we fitted three GRMs (M1, M2 and P1) simultaneously in LMM. We also compared results from our approach with those from a contemporary approach, M-GCTA [21,24]. Analysis through the M-GCTA approach involved generation of the GRMs using mothers’ and children’s genotypes together. The upper left quadrant of the GRM represented genetic relationship matrix of mothers (M’); the lower right quadrant represented genetic relationship matrix of children (G) and sum of the lower left quadrant and its transpose represented the genetic relationship matrix of mothers and children (D).
Each approach was fitted through three different models, namely, GREML, LDAK-Thin (where, all pruned SNPs with r^2^ ≤ 0.98 were given equal weights, i.e., 1.0) and LDAK-Weights (where, specific weights were calculated for each pruned SNP based on its pair-wise LD with other SNPs in a 100 kb window) (S1 Fig). For GREML and LDAK-Thin model, constant values of were used ( ). The difference between GREML and LDAK-Thin model exists in the number of SNPs used to calculate GRM. While GREML uses all genotyped/imputed SNPs to calculate GRM, LDAK-Thin uses only pruned SNPs for the same. On the other hand, LDAK-Weights model uses SNP-specific weights along with specific values of α for scaling.
Simulation
A total of 100 replicates of phenotypes were simulated using empirical genotype data from 10,375 mother-child pairs. We randomly selected 10,000 causal variants from a common set of all polymorphic SNPs across all datasets (approximately 11 million markers) and randomly picked their effect sizes from standard normal distribution [N(0,1)]. Phenotypes were generated from the model , where , and are phenotypic, genetic and residual (environmental) values for individual j. Genetic value of individual j was calculated as where is standardized genotypic value and is effect size of variant i in individual j. Multiplication of randomly picked effect sizes [( ] with standardized genotype/haplotype matrix implies that effect sizes are inversely proportional to MAF. A total of 100 independently generated residual values were added to individual’s genetic value ( ) to simulate 100 replicates of phenotype. Residual effects were randomly drawn from a distribution where e is a vector of residual effects, I is an identity matrix and is the variance of residual effects with where is the variance of genetic values and is a preset SNP-based narrow-sense heritability ( = 0.5).
Three types of traits were simulated considering effects only from the mother (maternal traits), only from fetus (fetal traits) and joint maternal-fetal effects (Table 1). Traits with joint maternal-fetal effects were simulated with different levels of average correlation among maternal and fetal genetic effects (-0.5, -1.0, 0.5 and 1.0) (Table 1). First, traits with independent maternal-fetal genetic effects were simulated using independent and same set sets of causal variants in mothers and children. As we observed similar results in both scenarios, traits with correlated maternal-fetal genetic effects were simulated using same set of 10,000 causal variants in mothers and children. We also simulated traits with POEs, where m1 had less effect in comparison to p1. We considered different scenarios, where varying fractions of causal variants, e.g., 25%, 50%, showed maternal imprinting. In each scenario, we simulated different levels of imprinting for m1 (25% - 100%) by reducing effect sizes of m1 (75% - 0%) as compared to p1 (Table 1). Non-zero effects of m1 as compared to p1 represented partial maternal imprinting whereas no effect of m1 represented complete imprinting. All relatedness matrices using simulated data were generated and fitted using different models such as GREML, LDAK-Thin and LDAK-weights into LMM in a similar way as mentioned in the statistical method and Implementation section. We compared our haplotype-based approach (H-GCTA) with conventional GCTA approach and a contemporary M-GCTA approach using above mentioned models with two α values (-1.0, -0.25) for all simulated traits. All analyses for simulated data were run using unrestricted REML, i.e., estimates could be less than zero.
Analysis of empirical datasets
We performed analyses using three sets of markers – all polymorphic SNPs, SNPs with MAF > 0.001, SNPs with MAF > 0.01 and SNPs with MAF > 0.05, to include the contribution of very rare, rare, common and very common variants to the heritability of pregnancy-related outcomes (S1 Fig). The marker sets based on the MAF cutoff were selected in each dataset separately, considering mothers as founders. Then, a common set of markers across all datasets was selected in each MAF cutoff category. We pooled individual datasets and generated five different GRMs utilizing mothers’ genotypes (M), children’s genotypes (F), maternal transmitted haplotypes (M1), maternal non-transmitted haplotypes (M2) and paternal transmitted haplotypes (P1) using the imputed genotype data of mother/child pairs (S2 Table). One of the related individuals was removed from each GRM (relatedness coefficient > 0.05) and a common set of mother-child pairs across five GRMs was selected in each MAF cutoff category (S3 Table). The GRMs were created and fitted into LMM using GREML, LDAK-Thin and LDAK-Weights model. To avoid the problem of “non positive definite variance-covariance matrix” and “non-convergence of likelihood” particularly in models with multiple GRMs, LMM-based analyses for empirical data were performed using restricted REML, i.e., estimates could not be less than zero, except for the analyses with very common SNPs (MAF > 0.05). All the analyses were adjusted for principal components (PCs) – 20 PCs for analyses through GCTA and M-GCTA and 30 PCs (10 PCs corresponding to m1, m2 and p1 each) for analyses through H-GCTA (S6 Fig). We also replicated our findings in another Nordic dataset (HARVEST) of ~ 8,000 mother-child pairs (S1 Text). We estimated the of gestational duration through GREML (α = -1.0) in replication dataset using SNPs with MAF > 0.01 (S7 Fig and S27 Table).
Supporting information
S1 TextSupporting data.(DOCX)
S1 TableGenotype and Phenotype records in datasets.Number of pregnancies and genotypes present in individual datasets. a) Number of genotypes typed in each dataset b) number of genotypes passed through genotype QC; c) number of pregnancies after genotype QC and phenotype inclusion/exclusion.(DOCX)
S2 TableGenotype information in individual datasets.Number of imputed sites using Haplotype Reference Consortium (HRC), polymorphic SNPs in individual datasets and common set of SNPs across all available datasets. In individual datasets, mothers were considered as founders for each MAF cutoff category and corresponding children were selected. Final analysis was performed using pooled data and common set of SNPs across all datasets.(DOCX)
S3 TablePhenotype information in pooled dataset.Number of samples with gestational duration, birth weight, birth length and head circumference in the pooled data; a) without relatedness coefficient cut-off, b) with relatedness coefficient cut-off < 0.05. For mother-child pairs with relatedness coefficient cutoff < 0.05, common set of mother-child pairs were selected from GRMs based on mother’s genotypes, children’s genotypes, maternal transmitted alleles (m1), maternal non-transmitted alleles (m2) and paternal transmitted alleles (p1).(DOCX)
S4 TablePhenotype summary in individual datasets.Descriptive statistics of gestational duration, birth weight, birth length and head circumference in ALSPAC, HAPO, FIN, DNBC and MoBa. All four traits were available only in two datasets, namely ALSPAC and HAPO.(DOCX)
S5 TableSNP-based heritability of simulated maternal traits from ALSPAC dataset. of simulated maternal traits from ALSPAC dataset, estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of ALSPAC dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S6 TableSNP-based heritability of simulated fetal traits from ALSPAC dataset. of simulated fetal traits from ALSPAC dataset, estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of ALSPAC dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S7 TableSNP-based heritability of simulated traits from ALSPAC dataset with independent maternal-fetal genetic effects using independent sets of causal variants in mother and child. of simulated traits from ALSPAC dataset with independent maternal-fetal genetic effects (independent sets of causal variants in mother and child), estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of ALSPAC dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S8 TableSNP-based heritability of simulated traits from ALSPAC dataset with independent maternal-fetal genetic effects using same set of causal variants in mother and child. of simulated traits from ALSPAC dataset with independent maternal-fetal genetic effects (same set of causal variants in mother and child), estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of ALSPAC dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S9 TableSNP-based heritability of simulated traits from ALSPAC dataset with correlated maternal-fetal genetic effects (average correlation = -1.0). of simulated traits with correlated maternal-fetal genetic effects (average correlation = -1.0), estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of ALSPAC dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S10 TableSNP-based heritability of simulated traits from ALSPAC dataset with correlated maternal-fetal genetic effects (average correlation = -0.5). of simulated traits from ALSPAC dataset with correlated maternal-fetal genetic effects (average correlation = -0.5), estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of ALSPAC dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S11 TableSNP-based heritability of simulated traits from ALSPAC dataset with correlated maternal-fetal genetic effects (average correlation = 1.0). of simulated traits from ALSPAC dataset with correlated maternal-fetal genetic effects (average correlation = 1.0), estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of ALSPAC dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S12 TableSNP-based heritability of simulated traits from ALSPAC dataset with correlated maternal-fetal genetic effects (average correlation = 0.5). of simulated traits from ALSPAC dataset with correlated maternal-fetal genetic effects (average correlation = 0.5), estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of ALSPAC dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S13 TableSNP-based heritability of simulated maternal traits from pooled dataset. of simulated maternal traits from pooled dataset, estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of Pooled dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S14 TableSNP-based heritability of simulated fetal traits from pooled dataset. of simulated fetal traits from pooled dataset, estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of Pooled dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S15 TableSNP-based heritability of simulated traits from pooled dataset with independent maternal-fetal genetic effects using independent sets of causal variants in mother and child. of simulated traits from pooled dataset with independent maternal-fetal genetic effects (independent sets of causal variants in mother and child), estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of Pooled dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S16 TableSNP-based heritability of simulated traits from pooled dataset with independent maternal-fetal genetic effects using same set of causal variants in mother and child. of simulated traits from pooled dataset with independent maternal-fetal genetic effects (same set of causal variants in mother and child), estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of Pooled dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S17 TableSNP-based heritability of simulated traits from pooled dataset with correlated maternal-fetal genetic effects (average correlation = -1.0). of simulated traits from pooled dataset with correlated maternal-fetal genetic effects (average correlation = -1.0), estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of Pooled dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S18 TableSNP-based heritability of simulated traits from pooled dataset with correlated maternal-fetal genetic effects (average correlation = -0.5). of simulated traits from pooled dataset with correlated maternal-fetal genetic effects (average correlation = -0.5), estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of Pooled dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S19 TableSNP-based heritability of simulated traits from pooled dataset with correlated maternal-fetal genetic effects (average correlation = 1.0). of simulated traits from pooled dataset with correlated maternal-fetal genetic effects (average correlation = 1.0), estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of Pooled dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S20 TableSNP-based heritability of simulated traits from pooled dataset with correlated maternal-fetal genetic effects (average correlation = 0.5). of simulated traits from pooled dataset with correlated maternal-fetal genetic effects (average correlation = 0.5), estimated through conventional GCTA, M-GCTA and H-GCTA approach. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of Pooled dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S21 TableSNP-based heritability of simulated fetal traits with parent-of-origin effects (POEs) from pooled dataset. of simulated fetal traits with POEs from pooled dataset, estimated through conventional GCTA, M-GCTA and H-GCTA approach using GREML (α = -1.0) model – A) 50% causal variants with POEs; B) 25% causal variants with POEs. POEs were incorporated by reducing the effect of m1 as compared to p1 by multiplying effects of m1 with (1 – I) where I is the imprinting factor such as 0.25, 0.50, 0.75 and 1.0. In each scenario, m1 shows either no imprinting, i.e., I = 0.0 ( ) or partial imprinting, i.e., I = 0.25-0.75 ( ) or complete imprinting, i.e., I = 1.0 ( ). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of Pooled dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S22 TableSNP-based heritability of simulated traits from pooled dataset with correlated maternal-fetal genetic effects and parent-of-origin effects (POEs). of simulated traits from pooled dataset with correlated maternal-fetal genetic effects and POEs, estimated through conventional GCTA, M-GCTA and H-GCTA approach using GREML (α = -1.0) model. For simplicity, we assumed that all causal variants exhibit POEs. POEs were incorporated by reducing the effect of m1 as compared to p1 by multiplying effects of m1 with (1 – I) where I is the imprinting factor such as 0.25, 0.50, 0.75 and 1.0. m1 shows either no imprinting, i.e., I = 0.0 ( ) or partial imprinting, i.e., I = 0.25-0.75 ( ) or complete imprinting, i.e., I = 1.0 ( ). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of Pooled dataset. P-values were calculated using z test statistics (two sided).(DOCX)
S23 TableSNP-based heritability of gestational duration and fetal size measurements at birth using all polymorphic SNPs.Comparison of estimated through conventional GCTA, M-GCTA and H-GCTA approach for A) gestational duration, B) birth weight, C) birth length and D) head circumference. Each approach was fitted using GREML (α = -0.25), LDAK-Thin (α = -1.0) and LDAK-Weights (α = -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). Gestational duration was adjusted for fetal sex and fetal size measurements at birth were additionally adjusted for gestational duration up to third orthogonal polynomial. Analyses using GCTA and M-GCTA approach were adjusted for 20 PCs and H-GCTA approach was adjusted for 30 PCs (10 PCs corresponding to m1, m2 and p1 each). P-values were calculated using z test statistics (one sided).(DOCX)
S24 TableSNP-based heritability of gestational duration and fetal size measurements at birth using SNPs with MAF > 0.001.Comparison of estimated through conventional GCTA, M-GCTA and H-GCTA approach for A) gestational duration, B) birth weight, C) birth length and D) head circumference. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). Gestational duration was adjusted for fetal sex and fetal size measurements at birth were additionally adjusted for gestational duration up to third orthogonal polynomial. Analyses using GCTA and M-GCTA approach were adjusted for 20 PCs and H-GCTA approach was adjusted for 30 PCs (10 PCs corresponding to m1, m2 and p1 each). P-values were calculated using z test statistics (one sided).(DOCX)
S25 TableSNP-based heritability of gestational duration and fetal size measurements at birth using SNPs with MAF > 0.01.Comparison of estimated through conventional GCTA, M-GCTA and H-GCTA approach for A) gestational duration, B) birth weight, C) birth length and D) head circumference. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). Gestational duration was adjusted for fetal sex and fetal size measurements at birth were additionally adjusted for gestational duration up to third orthogonal polynomial. Analyses using GCTA and M-GCTA approach were adjusted for 20 PCs and H-GCTA approach was adjusted for 30 PCs (10 PCs corresponding to m1, m2 and p1 each). P-values were calculated using z test statistics (one sided).(DOCX)
S26 TableSNP-based heritability of gestational duration and fetal size measurements at birth using SNPs with MAF > 0.05.Comparison of estimated through conventional GCTA, M-GCTA and H-GCTA approach for A) gestational duration, B) birth weight, C) birth length and D) head circumference. Each approach was fitted using GREML (α = -0.25, -1.0), LDAK-Thin (α = -0.25, -1.0) and LDAK-Weights (α = -0.25, -1.0). For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). Gestational duration was adjusted for fetal sex and fetal size measurements at birth were additionally adjusted for gestational duration up to third orthogonal polynomial. Analyses using GCTA and M-GCTA approach were adjusted for 20 PCs and H-GCTA approach was adjusted for 30 PCs (10 PCs corresponding to m1, m2 and p1 each). P-values were calculated using z test statistics (two sided).(DOCX)
S27 TableReplication of heritability estimation of gestational duration. of gestational duration in HARVEST dataset based on SNPs with MAF > 0.01estimated through H-GCTA using GREML (α = -1.0) model. Gestational duration was adjusted for fetal sex. P-values were calculated using z test statistics (one sided).(DOCX)
S1 FigFramework of the study.Framework of the study depicting the traits under study, available datasets, MAF cutoffs, list of GRMs created in each MAF cutoff category, selection of unrelated mother-child pairs. Last block shows methods/models utilized for estimation and comparison of estimated from our approach (H-GCTA) with those obtained by two available approaches – GCTA and M-GCTA.(PDF)
S2 FigDistribution of available phenotypes in datasets.Distribution of available phenotypes in each dataset categorized by fetal sex – A) distribution of gestational duration, birth weight, birth length and head circumference in ALSPAC dataset; B) distribution of gestational duration, birth weight, birth length and head circumference in HAPO dataset; C) distribution of gestational duration, birth weight and birth length in FIN dataset; D) distribution of gestational duration and birth weight in DNBC dataset and E) distribution of gestational duration in MoBa dataset.(PDF)
S3 FigComparison of for simulated traits from ALSPAC dataset – maternal traits, fetal traits and traits with independent maternal-fetal genetic effects.Comparison of for simulated traits from ALSPAC dataset, estimated through different approaches fitting GREML (α = -1.0): A) maternal traits; B) fetal traits; C) traits where independent sets of causal variants have effects through mother and fetus; D) traits where same set of causal variants have effects through mother and fetus. For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of ALSPAC dataset. P-values were calculated using z test statistics (two sided). * = (p value <5.0E-02), ** = (p value <1.0E-02), *** = (p value <1.0E-03) and **** = (p value <1.0E-04).(PDF)
S4 FigComparison of for simulated traits from ALSPAC dataset –traits with correlated maternal-fetal genetic effects.Comparison of estimated through different approaches fitting GREML (α = -1.0) for simulated traits with joint maternal–fetal effects from ALSPAC dataset: A) average correlation = -1.0; B) average correlation = -0.5; C) average correlation = 1.0; D) average correlation = 0.5. For GCTA, M is the GRM generated from maternal genotypes (m), and F is the GRM generated from fetal genotypes (f). For M-GCTA, M’ represents the genetic relationship matrix of mothers; G represents genetic relationship matrix of children and D represents mother-child covariance matrix. For H-GCTA, M1 is the GRM generated from maternal transmitted alleles (m1), M2 is the GRM generated from maternal non-transmitted alleles (m2), and P1 is the GRM generated from paternal transmitted alleles (p1). A total of 100 replicates of each phenotype were simulated using empirical genotypes of ALSPAC dataset. P-values were calculated using z test statistics (two sided). * = (p value <5.0E-02), ** = (p value <1.0E-02), *** = (p value <1.0E-03) and **** = (p value <1.0E-04).(PDF)
S5 FigSchematic representation of variance attributable to maternal transmitted, maternal non-transmitted and paternal transmitted haplotypes in H-GCTA.Schematic representation of variance attributable to maternal transmitted ( ), maternal non-transmitted ( ) and paternal transmitted ( ) haplotypes in H-GCTA – Z and W are the sets of causal variants with maternal and fetal effects, respectively. S1 (Black squares), S2 (orange circles) and S3 (purple triangles) are sets of causal variants with explicit maternal effects, joint-maternal-fetal effects and explicit fetal effects such that S1 ∈ Z, S2 = Z ∩ W and S3 ∈ W. and are causal effects through mother and fetus and and are reference allele frequencies of a causal variant in mother and fetus, respectively. Since each allele is a random draw from Bernoulli distribution, variance in terms of allele frequency is represented as and in mother and fetus, respectively. m1 affects the phenotype through maternal transmitted alleles in mother ( ) and maternal transmitted alleles in fetus ( ). Likewise, and represent the maternal non-transmitted and paternal transmitted alleles. Therefore, allelic effects - , (in the absence of POEs) and is the covariance of two binomial random variables m1’ and m1” present in mother and fetus, respectively; where, ρ and σ represent correlation and standard deviation of respective alleles. For a causal variant with joint maternal-fetal effect, in a random mating population, therefore, and total phenotypic variance explained by m1, m2 and p1 is .(PDF)
S6 FigPrincipal Components Analysis (PCA) plots using all polymorphic SNPs.Principal Components Analysis (PCA) plots of unrelated (relatedness coefficient < 0.5) mother-child pairs using all polymorphic SNPs. m: Mothers’ Genotypes; f: children’s genotypes; m1: maternal transmitted alleles; m2: maternal non-transmitted alleles; and p1: paternal transmitted alleles. Unlike PCA using pooled data, 20 PCs were created using independent SNPs from a merged dataset which included SNPs from 1000 genome samples (phase 3) and pooled dataset. Since 1000 genome dataset in general lacks parent-child information, we used the first allele of phased 1000 genome data along with m1 or p1 to create 20 PCs whereas second allele of phased 1000 genome data was used along with m2 to create 20 PCs.(PDF)
S7 FigReplication of heritability estimation of gestational duration. estimation of fetal sex adjusted gestational duration in HARVEST dataset using our approach (H-GCTA). Estimated of fetal sex adjusted gestational duration in pooled dataset is pasted for comparison (image on the right). Analysis was performed through GREML (α = -1.0) using SNPs with MAF > 0.01.(PDF)
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Visscher PM, Hill WG, Wray NR. Heritability in the genomics era--concepts and misconceptions. Nat Rev Genet. 2008;9(4):255–66. doi: 10.1038/nrg 2322 18319743 · doi ↗ · pubmed ↗
- 2Zhang Z, Ersoz E, Lai C-Q, Todhunter RJ, Tiwari HK, Gore MA, et al. Mixed linear model approach adapted for genome-wide association studies. Nat Genet. 2010;42(4):355–60. doi: 10.1038/ng.546 20208535 PMC 2931336 · doi ↗ · pubmed ↗
- 3Cnaan A, Laird NM, Slasor P. Using the general linear mixed model to analyse unbalanced repeated measures and longitudinal data. Stat Med. 1997;16(20):2349–80. doi: 10.1002/(sici)1097-0258(19971030)16:20<2349::aid-sim 667>3.0.co;2-e 9351170 · doi ↗ · pubmed ↗
- 4Henderson CR. Applications of Linear Models in Animal Breeding. University of Guelph; 1984. p. 462.
- 5Vinkhuyzen AAE, Wray NR, Yang J, Goddard ME, Visscher PM. Estimation and partition of heritability in human populations using whole-genome analysis methods. Annu Rev Genet. 2013;47:75–95. doi: 10.1146/annurev-genet-111212-133258 23988118 PMC 4037293 · doi ↗ · pubmed ↗
- 6Srivastava AK, Williams SM, Zhang G. Heritability estimation approaches utilizing genome-wide data. Curr Protoc. 2023;3(4):e 734. doi: 10.1002/cpz 1.734 37068172 PMC 10923601 · doi ↗ · pubmed ↗
- 7Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet. 2011;88(1):76–82. doi: 10.1016/j.ajhg.2010.11.011 21167468 PMC 3014363 · doi ↗ · pubmed ↗
- 8Yang J, Benyamin B, Mc Evoy BP, Gordon S, Henders AK, Nyholt DR, et al. Common SN Ps explain a large proportion of the heritability for human height. Nat Genet. 2010;42(7):565–9. doi: 10.1038/ng.608 20562875 PMC 3232052 · doi ↗ · pubmed ↗
