Genetic Interactions Explain Variance in Cingulate Amyloid Burden: An AV-45 PET Genome-Wide Association and Interaction Study in the ADNI Cohort
Jin Li, Qiushi Zhang, Feng Chen, Jingwen Yan, Sungeun Kim, Lei Wang, Weixing Feng, Andrew J. Saykin, Hong Liang, Li Shen

TL;DR
This study explores how gene interactions influence amyloid buildup in the brain, a key sign of Alzheimer's disease, using brain imaging and genetic data from patients.
Contribution
The study introduces a novel approach combining genome-wide association and interaction analyses to uncover gene interactions linked to amyloid burden in Alzheimer's.
Findings
Genetic main effects near APOE, APOC1, and TOMM40 were confirmed as significant.
Eight novel SNP-SNP interactions were identified that could explain amyloid burden variability.
The findings suggest that gene interactions may help explain missing heritability in Alzheimer's.
Abstract
Alzheimer's disease (AD) is the most common neurodegenerative disorder. Using discrete disease status as the phenotype and computing statistics at the single marker level may fail to capture the underlying biological interactions that contribute to disease mechanism and may contribute to the issue of “missing heritability.” We performed a genome-wide association study (GWAS) and a genome-wide interaction study (GWIS) of an amyloid imaging phenotype, using data from the Alzheimer's Disease Neuroimaging Initiative. We investigated the genetic main effects and interaction effects on cingulate amyloid-beta (Aβ) load in an effort to better understand the genetic etiology of Aβ deposition, a widely studied AD biomarker. PLINK was used in the single marker GWAS, and INTERSNP was used to perform the two-marker GWIS, focusing only on SNPs with p ≤ 0.01 for the GWAS analysis. Age,…
1. Introduction
Alzheimer's disease (AD) is the most common neurodegenerative disorder and is characterized by a progressive decline in memory and cognition. The pathologic cascade in AD involves two primary hallmarks: amyloid-β (Aβ) plaques and neurofibrillary tangles [1]. Genetics plays an important role in late-onset Alzheimer's disease (LOAD), but according to current estimates part of its heritability remains unexplained [2]. Decades of research have yielded only one genetic risk factor of large effect for LOAD: Apolipoprotein E (APOE), for which carrying 2 copies of the ε4 allele confers an approximately 6- to 30-fold risk of the disease [3]. Recent genome-wide association studies (GWAS) have identified several additional AD susceptibility genes, including *BIN1*, *CLU*, *ABCA7*, *CR1*, *PICALM*, *MS4A6A*, *MS4A4E*, *CD33*, *CD2AP*, and *EPHA1* [4–9]. However, these genetic factors have relatively small effect sizes (odds ratios of 0.87–1.23) and cumulatively account for approximately 35% of population-attributable risk [8]. More recently, a large-scale GWAS meta-analysis identified 11 new susceptibility loci, also with small effect sizes [10].
Traditional GWAS analyses used discrete disease status as the phenotypic trait of interest despite the fact that LOAD is a clinically heterogeneous disorder. Recently, researchers started to explore intermediate quantitative traits (QTs), such as clinical or cognitive features, biochemical assays, or neuroimaging biomarkers, in genetic association testing. This may have the potential to address the issue of clinical heterogeneity in LOAD. These QTs are often measured as continuous variables and thus exhibit a higher genetic signal-to-noise ratio. Further, most intermediate QTs are more proximal to their genetic bases than disease status. Thus, the incorporation of intermediate QTs can potentially increase statistical power to detect disease-related genetic associations [11, 12]. An ancillary benefit of using QTs is that they can serve as effective biomarkers for monitoring disease progress or treatment response in clinical practice or drug trials.
Over the past 10–15 years, studies have identified robust and predictive biomarkers for AD including levels of tau and amyloid-β peptides in cerebrospinal fluid (CSF), selective measures of brain atrophy using magnetic resonance imaging (MRI), and imaging of glucose hypometabolism and amyloid using positron emission tomography (PET) [13]. PET imaging can be used to quantify levels of amyloid in the brain by utilizing a radiotracer such as florbetapir (^18^F-AV-45 or AV-45) and/or Pittsburgh compound-B (PiB, N-methyl-[^11^C]2-(4′-methylaminophenyl)-6-hydroxybenzothiazole). These amyloid measures have been studied as biomarkers for classifying AD [14–17]. All these multimodal biomarkers can potentially serve as AD-relevant QTs and have been examined in many existing quantitative genetics studies of LOAD [18].
In addition, most genetic association studies compute statistics at the single marker level and ignore the possible underlying biological interactions that contribute to the development of disease [19] and could be a source of “missing heritability.” Because the search space of two-way interactions grows quadratically with the number of markers, testing them poses major computational and statistical challenges. One approach to this issue is to explore epistatic interactions in genome-wide data by using a priori statistical and/or biological evidence to generate a reduced set of genetic markers for interaction testing. Using this strategy, previous interaction studies in LOAD (e.g., [20–24]) implicated interactions between *CR1* and *APOE* using quantified Aβ PET as the outcome variable [24] and between cholesterol trafficking genes [21, 22] and tau phosphorylation genes [20] in case-control analyses. These studies demonstrated that important information can be garnered from investigating genetic interactions in complex diseases like LOAD.
With these observations, in the present work, we conducted a quantitative genetics study of an AD-associated amyloid imaging phenotype and examined both single marker main effects and two-marker interaction effects at the genome-wide level. Specifically, we investigated the main and interaction effects of genome-wide markers on cingulate amyloid-beta (Aβ) load in an effort to better understand the genetic etiology of cingulate cortical Aβ deposition (a LOAD biomarker).
2. Materials and Methods
Data used in the preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu/). The ADNI was launched in 2003 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies, and nonprofit organizations, as a $60 million, 5-year public-private partnership. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer's disease (AD). Determination of sensitive and specific markers of very early AD progression is intended to aid researchers and clinicians to develop new treatments and monitor their effectiveness, as well as lessen the time and cost of clinical trials.
The Principal Investigator of this initiative is Michael W. Weiner, M.D., VA Medical Center and University of California, San Francisco. ADNI is the result of efforts of many coinvestigators from a broad range of academic institutions and private corporations, and subjects have been recruited from over 50 sites across the US and Canada. The initial goal of ADNI was to recruit 800 subjects but ADNI has been followed by ADNI-GO and ADNI-2. To date these three protocols have recruited over 1500 adults, aged 55 to 90, to participate in the research, consisting of cognitively normal older individuals, people with early or late MCI, and people with early AD. The follow-up duration of each group is specified in the protocols for ADNI-1, ADNI-2, and ADNI-GO. Subjects originally recruited for ADNI-1 and ADNI-GO had the option to be followed in ADNI-2. For up-to-date information, see http://www.adni-info.org/.
We applied for and were granted permission to use data from the ADNI cohort (http://www.adni-info.org/) to conduct genetic association and interaction analyses.
2.1. Subjects and Data
For the present work, analyses were restricted to subjects with both genotyping data and AV-45 PET data available. The study sample (N = 602) included 190 healthy control (HC), 215 early MCI (EMCI), 152 late MCI (LMCI), and 45 AD subjects. Table 1 shows selected demographic and clinical characteristics of these participants at the time of the baseline AV-45 PET scan.
2.2. Genotyping Data and Quality Control
The genotyping data of the participants were collected using either the Illumina 2.5 M array (a byproduct of the ADNI whole genome sequencing sample) or the Illumina OmniQuad array [18, 25, 26]. For the present analyses, we included single nucleotide polymorphism (SNP) markers that were present on both arrays.
Quality control (QC) was performed using the PLINK software (version 1.07) [27]. SNPs that failed any of the following criteria were excluded from further analyses: (1) call rate per SNP ≥ 95%; (2) minor allele frequency ≥ 5% (n = 117,175 SNPs were excluded based on criteria 1 and 2); and (3) Hardy-Weinberg equilibrium test of p ≥ 10^−6^ (n = 997 SNPs were excluded) using control subjects only. Participants were excluded from the analysis if any of the following criteria were not satisfied: (1) call rate per participant ≥ 90% (3 participants were excluded); (2) sex check (1 participant was excluded); and (3) identity check for related pairs (3 sibling pairs were identified with PI_HAT > 0.5; one participant of each pair was randomly selected and excluded from the study).
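The three SNP-level thresholds above can be illustrated with a minimal Python sketch. This is not the PLINK procedure used in the study; the `SnpStats` container, the `passes_qc` helper, and the example values are hypothetical and only show how the stated cutoffs combine.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    """Per-SNP summary statistics (hypothetical container for this sketch)."""
    snp_id: str
    call_rate: float  # fraction of samples with a non-missing genotype
    maf: float        # minor allele frequency
    hwe_p: float      # Hardy-Weinberg p value computed in control subjects


def passes_qc(s, min_call_rate=0.95, min_maf=0.05, min_hwe_p=1e-6):
    """Return True only if the SNP meets all three thresholds from the text."""
    return (s.call_rate >= min_call_rate
            and s.maf >= min_maf
            and s.hwe_p >= min_hwe_p)


# Made-up example values, one SNP per failure mode.
snps = [
    SnpStats("snp_a", 0.99, 0.21, 0.40),  # meets all criteria
    SnpStats("snp_b", 0.93, 0.30, 0.50),  # fails call rate
    SnpStats("snp_c", 0.99, 0.02, 0.80),  # fails minor allele frequency
    SnpStats("snp_d", 0.98, 0.15, 1e-8),  # fails Hardy-Weinberg test
]
kept = [s.snp_id for s in snps if passes_qc(s)]
print(kept)
```

A SNP is retained only when every criterion holds, so a single failure is enough for exclusion.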
Population stratification analysis was performed using EIGENSTRAT [28] and confirmed using STRUCTURE [29]. It yielded 47 study participants who did not cluster with the remaining subjects and with the CEU HapMap samples who are primarily of European ancestry (non-Hispanic Caucasians). These 47 participants were excluded from the analysis. After QC, 582,718 SNPs and 602 samples remained available for genetic association and interaction analyses.
2.3. Quantitative Traits
A previous AV-45 PET study [30] showed that both AD and amnestic MCI subjects had higher standardized uptake value ratio (SUVR) in global cortical, precuneus, frontal, occipital, and posterior cingulate areas. We focused this study on one of these regions, the cingulate. The baseline mean SUVR measure for the cingulate cortical region, extracted by UC Berkeley (version 2014.7.30), was downloaded from the ADNI database (http://adni.loni.usc.edu/) for 987 ADNI-GO/2 participants. We also downloaded the cerebellum SUVR measure and used it to normalize the cingulate SUVR measure. The normalized SUVR was used as the quantitative trait (QT) in our analyses. After excluding 383 participants due to the lack of genotyping data, 602 individuals remained for further analysis.
In addition, amyloid-β 1-42 peptide (Aβ1-42), total tau (t-tau), and tau phosphorylated at threonine 181 (p-tau181p), measured in CSF samples, are potential diagnostic biomarkers for AD [31–33]. Among the 602 individuals, 504 have both AV-45 data and CSF data. Following a previous GWAS study on CSF biomarkers [34], QC was performed on the CSF data to reduce the potential influence of extreme outliers on statistical results. The mean and standard deviation (SD) of Aβ1-42 and of 2 ratios (t-tau/Aβ1-42 and p-tau181p/Aβ1-42) were calculated, blind to diagnostic information. Subjects who had at least one value more than 4 SDs above or below the mean of the corresponding CSF variable were regarded as extreme outliers and removed from the analysis. This step removed 5 additional participants, resulting in 499 valid CSF samples.
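The 4-SD outlier rule can be sketched as follows. This is an illustrative reading of the rule, not the authors' code: the field names (`abeta`, `ttau_ratio`, `ptau_ratio`), the synthetic values, and the use of the sample SD are assumptions.

```python
from statistics import mean, stdev


def flag_outliers(values, n_sd=4.0):
    """Flag values lying more than n_sd sample SDs from the mean."""
    m = mean(values)
    sd = stdev(values)  # sample SD; the text does not say which SD was used
    return [abs(v - m) > n_sd * sd for v in values]


def remove_extreme_outliers(subjects, variables, n_sd=4.0):
    """Drop a subject if any of the listed variables is an extreme outlier."""
    drop = [False] * len(subjects)
    for var in variables:
        flags = flag_outliers([s[var] for s in subjects], n_sd)
        drop = [d or f for d, f in zip(drop, flags)]
    return [s for s, d in zip(subjects, drop) if not d]


# Synthetic data: 30 subjects, the last one with an extreme Abeta value.
subjects = [
    {"id": i,
     "abeta": 100.0 + (i % 5),
     "ttau_ratio": 0.5 + 0.01 * (i % 3),
     "ptau_ratio": 0.2 + 0.01 * (i % 4)}
    for i in range(30)
]
subjects[29]["abeta"] = 1000.0

kept = remove_extreme_outliers(subjects, ["abeta", "ttau_ratio", "ptau_ratio"])
print(len(kept))
```

Means and SDs are computed once on the full sample for each variable, and a subject is removed if it is extreme on any one of them, matching the "at least one value" wording above.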
2.4. Genetic Association Studies: Main Effects and Interaction Effects
For GWAS examining the main effects, linear regression was performed using PLINK to determine the association of each SNP to the AV-45 measure. An additive genetic model was tested with covariates including age, gender, and diagnosis (through four binary dummy variables indicating HC, EMCI, LMCI, or AD). Manhattan plots and Q-Q plots were generated using Haploview (http://www.broad.mit.edu/mpg/haploview/) and R (http://www.r-project.org/), respectively.
For GWIS examining the interaction effects, the INTERSNP software [35] was applied to the genotyping data and phenotypic AV-45 measure. First, a single marker p value for the main effect was computed for each SNP. The top 10,000 SNPs with the smallest p values were selected and included in the subsequent interaction analysis. An explicit test for additive interaction (the full model including both additive and dominance effects plus interaction term versus the reduced model that does not contain interaction terms) was performed on all possible SNP pairs among the top 10,000 SNPs, using two-marker analysis. The computation was conducted in a linear regression framework. We examined the association between SNP-SNP interactions and the AV-45 measure while controlling for relevant covariates at the baseline scan, including age, sex, and clinical diagnosis. This resulted in a total of approximately 50 million unique SNP pairs to be tested from the ADNI dataset. Interactions were considered significant if their Bonferroni-corrected p value was < 0.05.
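The figure of approximately 50 million follows from counting unordered pairs among 10,000 SNPs. The sketch below works through that arithmetic and the per-test threshold a Bonferroni correction implies, assuming the correction is taken over the number of pairs tested (the text does not state the denominator explicitly). The helper name `bonferroni_adjust` is hypothetical.

```python
from math import comb

n_snps = 10_000
n_pairs = comb(n_snps, 2)  # unordered SNP pairs: n * (n - 1) / 2

alpha = 0.05
per_test_threshold = alpha / n_pairs  # raw p value needed per pair


def bonferroni_adjust(p_raw, n_tests):
    """Bonferroni-corrected p value, capped at 1."""
    return min(1.0, p_raw * n_tests)


print(n_pairs)
print(per_test_threshold)
print(bonferroni_adjust(1e-10, n_pairs))
```

Under this assumption, a pair needs a raw p value on the order of 10^−9 to remain significant after correction.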
2.5. Post Hoc Analysis
For identified significant interactions, we applied hierarchical linear regression using IBM SPSS 20 to estimate the amount of variance (R^2^) in the AV-45 measure accounted for by these interaction terms. We first included the same set of covariates (age, gender, and diagnosis) in the linear model. After that, we included *APOE* status, the closest SNP to the *BCHE* SNP identified in a prior amyloid GWAS study [36], and the two SNP main effects from the identified SNP pair. Finally, we included the SNP-SNP interaction term to calculate additional variance explained by the interaction term. The difference in R^2^ for the significant models was calculated in SPSS as ΔR^2^ = R^2^ (full model with interaction term) − R^2^ (reduced model without interaction term). Significant effects were plotted in SPSS as well.
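The ΔR^2^ computation can be illustrated with a small, self-contained Python sketch. It is not the SPSS procedure used in the study: it omits the covariates, *APOE* status, and the *BCHE* SNP, uses a toy phenotype defined purely by the product of two genotype dosages, and fits ordinary least squares through the normal equations. All function and variable names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(rows, y):
    """R^2 of an OLS fit of y on the given predictors plus an intercept."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Toy data: every combination of two additive genotype codes (0, 1, 2),
# with a phenotype that depends only on their product.
genotypes = [(g1, g2) for g1 in range(3) for g2 in range(3)]
y = [float(g1 * g2) for g1, g2 in genotypes]

r2_reduced = r_squared([(g1, g2) for g1, g2 in genotypes], y)
r2_full = r_squared([(g1, g2, g1 * g2) for g1, g2 in genotypes], y)
delta_r2 = r2_full - r2_reduced
print(r2_reduced, r2_full, delta_r2)
```

On this toy grid the main effects alone explain about 75% of the variance and the interaction term adds the remaining 25%, which is the quantity ΔR^2^ is meant to isolate.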
In addition, based on the identified interactions associated with AV-45, we further evaluated their main and interaction effects on the CSF levels related to amyloid, including Aβ1-42, t-tau/Aβ1-42, and p-tau181p/Aβ1-42. These three CSF measures were used as the QTs in 3 separate genetic analyses, following the same method and steps for analyzing the AV-45 phenotype as described above.
3. Results and Discussion
3.1. GWAS Results
Table 1 shows selected demographic and clinical characteristics of the 602 ADNI participants analyzed in this study; the EMCI group is slightly younger than the other groups. Figure 1 shows the Q-Q plot, which indicates no evidence of spurious inflation. Figure 2 shows the Manhattan plot. As expected, significant associations were identified between loci on chromosome 19 and the AV-45 measure. The top association is from rs4420638 (P = 5.11 × 10^−21^), which maps to *APOC1* [37]. A few other SNPs within the *APOE* region, including the adjacent *APOC1* and *TOMM40* genes, were significantly associated with the AV-45 level in the cingulate.
3.2. SNP-SNP Interaction Results
The INTERSNP model we tested included age, sex, and diagnosis as covariates. Eight SNP pairs showed significant interaction effects on the cingulate AV-45 measure (corrected p value < 0.05) (Table 2): rs2194938 (CLSTN2)-rs7644138 (FHIT), rs7916162 (TACC2)-rs2326536 (PRNP^∗^), rs2295873 (TACC2)-rs7794838 (IGFBP3^∗^), rs2295874 (TACC2)-rs2326536 (PRNP^∗^), rs13056151 (BCR)-rs17594541 (MAGI2), rs13426621 (LOC388942)-rs7037332 (TYRP1^∗^), rs16936424 (LOC387761)-rs10504164 (N/A), and rs16939265 (HNF4G^∗^)-rs6854047 (RWDD4^∗^).
3.3. Post Hoc Analysis
Table 2 also shows the results of post hoc analysis on cingulate amyloid deposition. Age, gender, and diagnosis were first included in the model and accounted for 11% of variance in the amyloid QT. *APOE* status then accounted for an additional 16.1% of variance, and the closest SNP to the *BCHE* SNP identified in [36] accounted for an additional 1.8% of variance. For each interaction, we ran a hierarchical linear regression model. We first added the genetic main effects and then the genetic interaction term to determine the variance associated with the interaction term alone. For rs2194938 (CLSTN2)-rs7644138 (FHIT), the SNP main effects accounted for 3.4% of variance, and the interaction term accounted for 4.9% of variance (8.3% combined). For rs7916162 (TACC2)-rs2326536 (PRNP^∗^), the main effects accounted for 2% of variance, and the interaction accounted for 4.9% of variance (6.9% combined). For rs2295873 (TACC2)-rs7794838 (IGFBP3^∗^), the main effects accounted for 3.7% of variance, and the interaction term accounted for 4.1% of variance (7.8% combined). For rs2295874 (TACC2)-rs2326536 (PRNP^∗^), the SNP main effects accounted for 3.7% of variance, and the interaction term accounted for 4.1% of variance (7.8% combined). For rs13056151 (BCR)-rs17594541 (MAGI2), the main effects accounted for 3.5% of variance, and the interaction term accounted for 2.6% of variance (6.1% combined). For rs13426621 (LOC388942)-rs7037332 (TYRP1^∗^), the main effects accounted for 4.2% of variance, and the interaction accounted for 2.3% of variance (6.5% combined). For rs16936424 (LOC387761)-rs10504164 (N/A), the main effects accounted for 3.7% of variance, and the interaction term accounted for 1.7% of variance (5.4% combined). For rs16939265 (HNF4G^∗^)-rs6854047 (RWDD4^∗^), the main effects accounted for 2.7% of variance, and the interaction term accounted for 1.3% of variance (4.0% combined).
Using a slightly reduced sample (N = 499) with CSF biomarker data available, all 8 identified interactions remained statistically significant when performing hierarchical linear regression using the CSF phenotypes (one baseline measure: Aβ, two ratios: t-Tau/Aβ and p-Tau/Aβ) instead of the AV-45 measure as outlined earlier (Table 3). We also repeated the same AV-45 analysis on the reduced sample and achieved a very similar result (Table 4).
3.4. Discussion
In this study, we performed both GWAS and GWIS analyses of the cingulate AV-45 florbetapir PET measure, using a sample of 602 subjects from the ADNI database. To our knowledge, this is the first genome-wide study to examine SNP-SNP interaction effects on cingulate amyloid deposition in a substantially large sample. In the single marker analysis, as expected, SNPs in the *APOE*, *APOC1*, and *TOMM40* genes (Figure 2) exhibited genome-wide significant associations with the cingulate cortical Aβ level. Two-marker interaction analyses revealed 8 SNP pairs with significant genetic interactions (corrected p < 0.05) affecting cingulate amyloid burden. The risk variants at these pairs had low main effects, but their interactions explained a relatively high proportion of the variance in cingulate amyloid deposition (Table 2).
In addition, missing heritability can partially be explained by the interaction effects that are not examined in traditional GWAS analyses. The genetic risk underlying a diagnosis of LOAD is thought to arise from multiple genes that interact with each other. We performed a post hoc analysis investigating the effects of the identified SNP-SNP interactions on LOAD-related quantitative phenotypes, including amyloid deposition and CSF biomarkers (Aβ, t-tau/Aβ, p-tau/Aβ). Given that amyloid and tau phosphorylation are major AD hallmarks, it is not surprising to observe the genetic interaction effects on both the amyloid load and relevant CSF biomarkers (Tables 2–4). Our results suggest that significant SNP-SNP interactions could exist between SNPs with low and insignificant main effects, and that these interactions could be associated with altered amyloid burden and may explain a substantial share of risk in AD.
In line with our hypothesis, we identified multiple significant genetic interactions associated with cingulate amyloid deposition. Several genes found in this study have already been implicated in AD, thus lending confidence to the analytic procedure and results. These genes include *PRNP* [38, 39], *IGFBP3* [40, 41], and *MAGI2* [42, 43]. For example, Guerreiro et al. reported a nonsense mutation in *PRNP* associated with clinical Alzheimer's disease [38]. Ikonen et al. showed that interaction between the Alzheimer's survival peptide humanin and insulin-like growth factor-binding protein 3 (IGFBP3) regulates cell survival and apoptosis [40]. Potkin et al. identified a *MAGI2* SNP associated with hippocampal atrophy using the ADNI data [42]. Perhaps more importantly, this study also identified a number of SNPs that had not yet been associated with AD in conventional GWAS studies. Thus, this study exposes several potential candidate genes that could be explored in future replication samples.
In addition to the findings above, this study had several methodological and technical advantages over other imaging genetics studies. (1) To our knowledge, this is the first genome-wide study to explore how SNP-SNP interactions influence cingulate amyloid burden, measured using florbetapir PET scan information. (2) Using continuous quantitative traits as phenotypes confers higher statistical power than using conventional clinical status. (3) The sample in this study included HC, EMCI, LMCI, and AD, thus providing a continuous and wide spectrum of disease progression in the dataset. (4) Our approach adjusted for, rather than ignored, potential confounding factors including age, sex, diagnosis, and the previously identified risk genes *APOE* and *BCHE*, and thus provided a more accurate estimate of the interaction effects on amyloid burden. (5) CSF data were used in this study to cross-check the identified interactions, which had the potential to serve as an indirect validation strategy or provide complementary information.
Our study has several limitations. (1) We used single marker main effect p values to select SNPs for interaction analysis, which could miss significant interactions between SNPs with insignificant main effects. (2) The small cell sizes in the interaction analyses might introduce false positives. (3) Our approach is mostly data-driven and does not utilize existing biological knowledge (e.g., pathways, networks, and other functional annotation data), which may reduce statistical power and the interpretability of the results.
4. Conclusions
We performed GWAS and GWIS using amyloid imaging as the quantitative phenotype and investigated the genetic interaction effects on cingulate amyloid-beta (Aβ) load. The single marker analyses revealed significant hits within or proximal to the *APOE*, *APOC1*, and *TOMM40* genes, which were previously implicated in AD. The interaction analyses yielded a few novel interaction findings associated with cingulate amyloid burden, such as those between *CLSTN2* and *FHIT*, between *TACC2* and *PRNP*, between *TACC2* and *IGFBP3*, and between *BCR* and *MAGI2*. Each of these SNP pairs demonstrated significant interaction effects while their individual main effects were not prominent. This suggests that searching for interaction effects may help solve the problem of missing heritability to some extent. Future studies should attempt to replicate these results in independent datasets with neuroimaging and genetic data, as they become available. Additional pathway analysis and gene set enrichment analysis could be performed to help understand the genetic interactions between SNPs on amyloid imaging phenotypes and potentially provide critical functional evidence in support of the statistical association findings.
References
- 1. Price D. L., Sisodia S. S. Mutant genes in familial Alzheimer's disease and transgenic models. Annual Review of Neuroscience. 1998;21:479–505. doi:10.1146/annurev.neuro.21.1.479. PMID: 9530504.
- 2. Bertram L., Lill C. M., Tanzi R. E. The genetics of Alzheimer disease: back to the future. Neuron. 2010;68(2):270–281. doi:10.1016/j.neuron.2010.10.013. PMID: 20955934.
- 3. Akiyama H., Ikeda K., Kondo H., Kato M., McGeer P. L. Microglia express the type 2 plasminogen activator inhibitor in the brain of control subjects and patients with Alzheimer's disease. Neuroscience Letters. 1993;164(1-2):233–235. doi:10.1016/0304-3940(93)90899-V. PMID: 8152607.
- 4. Harold D., Abraham R., Hollingworth P. Genome-wide association study identifies variants at CLU and PICALM associated with Alzheimer's disease. Nature Genetics. 2009;41:1088–1093. doi:10.1038/ng.440. PMID: 19734902. PMCID: PMC2845877.
- 5. Belbin O., Carrasquillo M. M., Crump M. Investigation of 15 of the top candidate genes for late-onset Alzheimer's disease. Human Genetics. 2011;129(3):273–282. doi:10.1007/s00439-010-0924-2. PMID: 21132329. PMCID: PMC3036835.
- 6. Carrasquillo M. M., Hunter T. A., Ma L. Replication of BIN1 association with Alzheimer's disease and evaluation of genetic interactions. Journal of Alzheimer's Disease. 2011;24(4):751–758. doi:10.3233/jad-2011-101932. PMID: 21321396. PMCID: PMC3489170.
- 7. Hollingworth P., Harold D., Sims R. Common variants at ABCA7, MS4A6A/MS4A4E, EPHA1, CD33 and CD2AP are associated with Alzheimer's disease. Nature Genetics. 2011;43(5):429–436. doi:10.1038/ng.803. PMID: 21460840. PMCID: PMC3084173.
- 8. Naj A. C., Jun G., Beecham G. W. Common variants at MS4A4/MS4A6E, CD2AP, CD33 and EPHA1 are associated with late-onset Alzheimer's disease. Nature Genetics. 2011;43(5):436–441. doi:10.1038/ng.801. PMID: 21460841. PMCID: PMC3090745.
