Discovery of genetic susceptibility variants in pediatric and adult ependymoma
Joshua D Strauss, Priya B Shetty, Spiridon Tsavachidis, Jinyoung Byun, Stephen C Mack, Xiao Xiangjun, Terri S Armstrong, Mark R Gilbert, Lisa Mirabello, Smita Bhatia, Wendy M Leisenring, Lindsay M Morton, Gregory T Armstrong, Jon Foss-Skiftesvik, Christian Munch Hagen

TL;DR
This study identifies genetic variants linked to ependymoma, a rare brain tumor, in both children and adults.
Contribution
The study reports novel genome-wide significant intronic and intergenic variants associated with pediatric and adult ependymoma.
Findings
A significant intronic variant in EDIL3 and a nearly significant variant in LHX4 were found in pediatric whole-genome sequencing data.
Two significant intronic variants in FAM149A and CYS1, together with a highly significant intergenic variant near C1orf94, were identified in genotyped pediatric data.
A significant intronic variant in KCNQ3 was found in adult subjects.
Abstract
Ependymoma is a malignancy of the neuroepithelium-derived ependyma that lines the spinal cord and ventricles of the brain, occurring most frequently in young children and older adults. Genetic susceptibility to ependymoma has proven difficult to assess due to disease rarity. We performed genome-wide association studies (GWAS) of 478 ependymoma patients and 4,841 disease-free controls of European ancestry. Ependymoma patients consisted of 117 children (<18 years old) with whole-genome sequencing (WGS), 142 children with genotyping, and 219 adults (≥18 years old) with genotyping. Genotyped samples were imputed using the 1000 Genomes Project as the reference panel and underwent quality control filtering. The GWAS was performed separately by age group and technology (genotyped or WGS). GWAS variants were considered significant at P < 5 × 10^−8^. Among pediatric subjects with WGS data, we…
| Age group | Genomic analysis | Case/control | Cohort/study | Subjects pre-QC | Subjects post-QC | Sex-male post-QC |
|---|---|---|---|---|---|---|
| Pediatric | Whole genome sequencing | Case | CBTN | 48 | 48 | 56% |
| | | | St. Jude | 73 | 69 | |
| | | Control | St. Jude | 298 | 285 | 48% |
| | Genotyping | Case | ACCESS | 60 | 60 | 64% |
| | | | CCSS | 55 | 55 | |
| | | | GICC | 8 | 8 | |
| | | | NCI-Connect | 9 | 8 | |
| | | | TOPNOC | 12 | 11 | |
| | | Control | ADD Health | 1340 | 1331 | 47% |
| Adult | Genotyping | Case | ACCESS | 6 | 6 | 40% |
| | | | GICC | 77 | 73 | |
| | | | NCI-Connect | 157 | 140 | |
| | | Control | GICC | 3249 | 3225 | 57% |

| Analytic set | CHR | BP | rsID | A1 | A2 | Case genotypes | Control genotypes | Case AF | Control AF | Cases | Controls | Score | P-value | Band | Feature | Gene |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Pediatric with genotyping | 1 | 34299373 | rs1404350 | C | A | 7-21-114 | 5-153-1173 | 0.123 | 0.061 | 142 | 1331 | -11.614 | 1.23E-14 | 1p35.1 | intergenic | C1orf94 (distance = 80243 BP) |
| Pediatric with genotyping | 2 | 10070302 | rs61052588 | C | T | 16-57-69 | 75-516-740 | 0.313 | 0.250 | 142 | 1331 | -9.072 | 3.02E-08 | 2p25.1 | intronic | CYS1 |
| Pediatric with genotyping | 4 | 186171576 | rs6852180 | G | A | 1-18-123 | 5-133-1193 | 0.070 | 0.054 | 142 | 1331 | -4.844 | 1.81E-08 | 4q35.1 | intronic | FAM149A |
| Pediatric with WGS | 5 | 84182729 | rs149378 | C | T | 34-57-26 | 44-146-95 | 0.534 | 0.411 | 117 | 285 | -19.705 | 1.88E-08 | 5q14.3 | intronic | EDIL3 |
| Adult with genotyping | 8 | 132455457 | rs79089725 | T | A | 3-27-189 | 8-362-2855 | 0.075 | 0.059 | 219 | 3225 | -15.642 | 2.04E-08 | 8q24.22 | intronic | KCNQ3 |
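
The allele-frequency columns in the table above can be reproduced from the genotype-count columns. The sketch below assumes the counts are ordered A1A1-A1A2-A2A2, which is an inference from the reported frequencies rather than something the table states; the function name `a1_frequency` is hypothetical.

```python
def a1_frequency(genotype_counts: str) -> float:
    """Frequency of allele A1 from a 'homA1-het-homA2' count string.

    Assumes the three counts are ordered A1A1-A1A2-A2A2 (an inference:
    this ordering reproduces the Case AF / Control AF columns).
    """
    hom_a1, het, hom_a2 = (int(x) for x in genotype_counts.split("-"))
    n_subjects = hom_a1 + het + hom_a2
    # Each subject carries two alleles; homozygotes contribute two copies of A1.
    return (2 * hom_a1 + het) / (2 * n_subjects)


# rs1404350 in the pediatric genotyping set (counts copied from the table)
case_af = a1_frequency("7-21-114")
control_af = a1_frequency("5-153-1173")
print(round(case_af, 3), round(control_af, 3))
```

Under that ordering, the rounded values match the reported 0.123 and 0.061 for this variant.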
Ependymoma is a rare malignancy of the central nervous system (CNS) arising from the neuroepithelium-derived ependyma that lines the spinal cord and ventricles of the brain, with an annual incidence of approximately two cases per million individuals in the United States. Under normal conditions, the glial cell-derived ependyma regulates cerebrospinal fluid and supports neurogenesis.1,2 Ependymomas can occur at any age, but incidence primarily follows a biphasic distribution with peaks in children under 10 years old and adults in their 50s.3,4 Disease location typically differs by age, with ∼90% of pediatric cases occurring in the brain, whereas two-thirds of adult cases occur in the spinal cord.5
Germline variation related to ependymoma has not been thoroughly investigated. Previous studies have identified ependymoma susceptibility alterations in APC, NF1, NF2, LZTR1, PMS2, and TP53.6–8 However, these investigations included few ependymoma patients, lacked disease-free controls, and were restricted to known cancer-related genes. Genome-wide association studies (GWAS) of ependymoma may identify novel susceptibility loci that provide insight into tumor development. Considering the approximately 40-year difference between the most common ages of disease occurrence, we posit that there are both shared and distinct germline variants by age at diagnosis. The genetic assessment of ependymoma susceptibility is critical to advance our comprehension of disease etiology and early detection techniques.
In this study, we performed GWAS of adult and pediatric ependymoma patients to identify novel disease-associated common germline variants.
Methods
Data Sources
Ependymoma patients and controls with genotyping or whole genome sequencing (WGS) data were identified from several consortia/studies. A description of the studies and technologies used is available in the Supplementary Material (S1).
Ethics
Each original study collected participants’ clinical information and samples for DNA extraction after obtaining written informed consent on protocols individually approved by each study’s ethics review board. The current GWAS analysis was approved by the Baylor College of Medicine (Houston, TX) Institutional Review Board (IRB).
Whole-Genome Sequencing Calling
WGS samples aligned to Genome Reference Consortium Human Build 38 (GRCh38) were processed according to the GATK v4.5.0 germline short variant discovery best practices workflow.9 All sample files were jointly called to enhance the sensitivity and accuracy of variant detection. Called variants were then hard filtered with the following GATK-suggested thresholds: quality score > 30, quality normalized by depth > 2, mapping quality > 40, mapping quality rank sum > −12.5, read position rank sum > −8, symmetric odds ratio < 3.0, Fisher strand bias < 60. Samples with a call rate < 95% were removed.
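
The hard-filter thresholds listed above can be read as a single pass/fail predicate per variant. The following is a minimal sketch, not the authors' pipeline: the function name is hypothetical, the keys are the usual GATK annotation names (QUAL, QD, MQ, MQRankSum, ReadPosRankSum, SOR, FS), and the mapping from the prose to those names is an assumption. Treating an absent rank-sum annotation as passing is also an assumption of this sketch.

```python
# Keep-if conditions copied from the text. ">" means the value must exceed
# the threshold; "<" means it must stay below it.
KEEP_IF = {
    "QUAL": (">", 30),              # quality score
    "QD": (">", 2),                 # quality normalized by depth
    "MQ": (">", 40),                # mapping quality
    "MQRankSum": (">", -12.5),      # mapping quality rank sum
    "ReadPosRankSum": (">", -8),    # read position rank sum
    "SOR": ("<", 3.0),              # symmetric odds ratio (strand bias)
    "FS": ("<", 60),                # Fisher strand bias
}

# Rank-sum annotations may be absent at some sites; this sketch lets a
# missing value pass (an assumption, not stated in the text).
OPTIONAL = {"MQRankSum", "ReadPosRankSum"}


def passes_hard_filters(annotations: dict) -> bool:
    """Return True when every available annotation satisfies its threshold."""
    for name, (direction, threshold) in KEEP_IF.items():
        value = annotations.get(name)
        if value is None:
            if name in OPTIONAL:
                continue
            return False
        ok = value > threshold if direction == ">" else value < threshold
        if not ok:
            return False
    return True


example = {"QUAL": 812.4, "QD": 14.2, "MQ": 60.0, "MQRankSum": 0.1,
           "ReadPosRankSum": -0.4, "SOR": 0.9, "FS": 1.3}
print(passes_hard_filters(example))
```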
Genotyped Sample Imputation
Genotyped samples were imputed to improve variant overlap between multiple study methodologies and genomic platforms. First, samples with a call rate < 95%, structural variants, and multi-allelic variants were removed from further analyses. All samples were lifted over to GRCh38 using CrossMap v0.7.0 and aligned to the 30X high-coverage 1000 Genomes Project (1KG) GRCh38 reference panel of 2,504 unrelated samples.10,11 Phasing and imputation were performed using BEAGLE 5.4 with 1KG as the reference panel.12,13 Imputed single-nucleotide polymorphisms (SNPs) were filtered by imputation quality score (dosage-R^2^ > 0.4).
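
As an illustration of the imputation-quality filter, the sketch below reads the dosage-R^2^ value from a VCF INFO string and applies the 0.4 cutoff from the text. It assumes the value is stored under the `DR2` key, as in Beagle 5 output; the function names and the example INFO string are hypothetical, and excluding records without a `DR2` value is a choice made for this sketch.

```python
from typing import Optional


def dosage_r2(info: str) -> Optional[float]:
    """Extract DR2 from a semicolon-delimited VCF INFO string, if present."""
    for field in info.split(";"):
        key, _, value = field.partition("=")
        if key == "DR2" and value:
            # At a multi-value field, use the first value.
            return float(value.split(",")[0])
    return None


def keep_imputed_variant(info: str, threshold: float = 0.4) -> bool:
    """Keep a variant only when DR2 exceeds the threshold (0.4 in the text)."""
    dr2 = dosage_r2(info)
    return dr2 is not None and dr2 > threshold


print(keep_imputed_variant("DR2=0.87;AF=0.12;IMP"))  # well imputed
print(keep_imputed_variant("DR2=0.31;AF=0.06;IMP"))  # below the cutoff
```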
Quality Control
Samples were merged into three analytic sets based on age group (pediatric or adult) and technology (genotyping or WGS). First-degree relative pairs were identified using Kinship-based INference for Gwas (KING) in Plink2, and one sample from each pair was randomly removed from further analyses.14,15 SNPs with call rates < 98%, minor allele frequency (MAF) < 5%, or a significant departure from Hardy-Weinberg equilibrium (P < 1 × 10^−20^) in controls were excluded. The ancestry of each sample was genomically inferred as one of five distinct super-populations (Admixed American, African, East Asian, European, South Asian) using KING v2.3.2 with the known ancestries of the 1KG samples. Data transformation and quality control measures were performed with BCFtools and Plink2.16
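
The three variant-level exclusion rules above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the Hardy-Weinberg test here is a 1-degree-of-freedom chi-square approximation swapped in for simplicity, whereas genetics toolkits commonly use an exact test, so P-values will differ somewhat from those a tool such as Plink2 reports.

```python
import math


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) P-value for departure from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic site: no departure can be tested
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_variant_qc(call_rate: float, maf: float, control_counts: tuple) -> bool:
    """Apply the thresholds from the text: call rate, MAF, and HWE in controls."""
    return (
        call_rate >= 0.98
        and maf >= 0.05
        and hwe_chi2_pvalue(*control_counts) >= 1e-20
    )


# Control genotype counts for rs1404350; the call rate is a made-up value.
print(passes_variant_qc(call_rate=0.995, maf=0.061, control_counts=(5, 153, 1173)))
```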
Study Population
Ependymoma patients met diagnostic coding requirements according to the International Classification of Childhood Cancer, third edition (ICCC-3; code IIIa) or the International Classification of Diseases for Oncology, third edition (ICD-O-3; histology codes 9383 and 9391-9394) for pediatric and adult patients, respectively.17,18 A total of 478 patients diagnosed with ependymoma and 4,841 disease-free controls of genomically inferred European ancestry with germline genotyped or sequencing data passed the quality control methods described above. Of the 478 patients, 259 were children (<18 years old) and 219 were adults (≥18 years old). Disease-free controls had no record of CNS malignancies in their respective studies. Cases and controls were assigned into three sets, consisting of pediatric subjects with WGS (117 cases, 285 controls), pediatric subjects with genotyping (142 cases, 1,331 controls), and adult subjects with genotyping (219 cases, 3,225 controls; Table 1). The multi-ancestral GWAS of the pediatric subjects with genotyping and pediatric subjects with WGS is available in the Supplementary Materials (S2-S4) and Supplementary Tables.
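
As a quick arithmetic check, the three analytic sets described above add up to the reported totals. The dictionary layout is only an illustrative way to hold the counts quoted in the text.

```python
# (cases, controls) per analytic set, copied from the paragraph above
analytic_sets = {
    "pediatric WGS": (117, 285),
    "pediatric genotyping": (142, 1331),
    "adult genotyping": (219, 3225),
}

total_cases = sum(cases for cases, _ in analytic_sets.values())
total_controls = sum(controls for _, controls in analytic_sets.values())
pediatric_cases = sum(
    cases for name, (cases, _) in analytic_sets.items() if name.startswith("pediatric")
)

print(total_cases, total_controls, pediatric_cases)
```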
Statistical Analysis
Genome-wide analyses
Ependymoma occurrence was modeled for each variant using a generalized linear mixed model association test (GMMAT) in R v4.4.1 (R Foundation for Statistical Computing). GMMAT v1.4.2 was implemented to adjust for population structure and cryptic relatedness.19 The genetic relationship matrix required for GMMAT was created using Plink2. WGS and genotyped models were adjusted for the first 2 and the first 6 principal components (PCs), respectively. Genome-wide significant variants (P < 5 × 10^−8^) were assessed for independence by evaluating linkage disequilibrium with adjacent SNPs (±250 kilobases; r^2^ < 0.6), and the resulting significant and independent SNPs were annotated with ANNOVAR.20 Summary Manhattan and quantile-quantile (QQ) plots were created with GWASLab v3.5.7.21 GWAS model fit was assessed using the genomic inflation factor (λGC) and QQ plots.
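
The genomic inflation factor mentioned above is conventionally computed as the median observed chi-square statistic divided by the median of a 1-degree-of-freedom chi-square distribution (about 0.455). The sketch below uses that common definition with the Python standard library; the text does not state how λGC was calculated here, so the definition and the function name are assumptions.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2
GENOME_WIDE_ALPHA = 5e-8  # significance threshold used in the text


def genomic_inflation(p_values) -> float:
    """Median-based lambda_GC from two-sided P-values, each in (0, 1]."""
    # A two-sided P-value p corresponds to chi2 = z**2 with z = Phi^-1(p / 2).
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Evenly spaced P-values mimic a well-calibrated null distribution.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
print(round(genomic_inflation(null_like), 2))
```

Values near 1 indicate little inflation of test statistics; values well above 1 would suggest residual population structure or other systematic bias.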
Results
Pediatric Variant Analysis
The GWAS of pediatric subjects with WGS assessed 5.8 million variants. Of these, one variant was independent and significant (Figure 1, Table 2, Supplementary Tables). This intronic variant was located in EDIL3 (rs149378, P = 1.9 × 10^−8^). A nearly significant intronic variant was found in LHX4 (rs79008224, P = 7.2 × 10^−8^). Cytogenetic band 20q13.33 also displayed a notable signal, with two independent variants: one in an intron of COL9A3 (rs73157303, P = 3.0 × 10^−7^) and one 660 bases downstream of OGFR (rs910151, P = 5.9 × 10^−8^).
Figure 1. Genome-wide association study of whole-genome sequenced pediatric ependymoma of European ancestry. Each genome-wide association study displays the corresponding quantile-quantile plot (left) and Manhattan plot (right). The red upper dotted horizontal line indicates the genome-wide significance threshold (P = 5 × 10^−8^), and the green lower dotted line marks the suggestive significance threshold (P = 1 × 10^−5^).
The GWAS of pediatric subjects with genotyping assessed 4.8 million variants. A total of three SNPs were considered independent and significant (Figure 2, Table 2, Supplementary Tables). Intronic variants were found in FAM149A (rs6852180, P = 1.8 × 10^−8^) and CYS1 (rs61052588, P = 3.0 × 10^−8^). Additionally, an intergenic SNP was located 80 kilobases from the gene C1orf94 (rs1404350, P = 1.2 × 10^−14^).
Figure 2. Genome-wide association study of genotyped pediatric ependymoma of European ancestry. Each genome-wide association study displays the corresponding quantile-quantile plot (left) and Manhattan plot (right). The red upper dotted horizontal line indicates the genome-wide significance threshold (P = 5 × 10^−8^), and the green lower dotted line marks the suggestive significance threshold (P = 1 × 10^−5^).
Adult Variant Analysis
The GWAS of adult subjects with genotyping assessed 5.9 million variants. One SNP was significant and independent (Figure 3, Table 2, Supplementary Tables). This variant was located in an intron of KCNQ3 (rs79089725, P = 2.0 × 10^−8^).
Figure 3. Genome-wide association study of genotyped adult ependymoma of European ancestry. Each genome-wide association study displays the corresponding quantile-quantile plot (left) and Manhattan plot (right). The upper red dotted horizontal line indicates the genome-wide significance threshold (P = 5 × 10^−8^), and the lower green dotted line marks the suggestive significance threshold (P = 1 × 10^−5^).
Discussion
Ependymoma’s rarity impedes research into advancing knowledge of disease etiology and detection techniques. Our analysis represents the most extensive ependymoma-specific GWAS conducted to date and identifies several novel susceptibility loci unique to pediatric and adult cases.
Pediatric Ependymoma Variants
Multiple pediatric-specific loci and disease-associated genes were linked to ependymoma risk. The GWAS conducted in pediatric individuals with WGS data identified a significant association with EDIL3. EDIL3 has been associated with several cancers, including breast, gastric, liver, and lung cancer.22–25 Of relevance, EDIL3 variants have been reported as potentially pathogenic in desmoplastic infantile astrocytoma and ganglioglioma, and the gene may contribute to pathways involved in Alzheimer’s disease pathogenesis.26,27 Notable signals were also present in the introns of LHX4 and COL9A3, as well as downstream of OGFR. LHX4 is involved in neural development, primarily in the pituitary region, and has been linked with colorectal cancer.28,29 A recent ependymoma study using a murine model identified LHX4 and LHX2 enrichment.30 COL9A3 has also been directly linked to ependymoma tumor invasiveness and structural development of the RELA molecular tumor subtype.31,32 OGFR has not previously been associated with ependymoma but is well known for its role in several malignancies as well as in CNS function.33–35
In pediatric subjects analyzed by genotyping, significant variants were found in CYS1 and FAM149A. CYS1 has been associated with glioma tumorigenesis and survival.36,37 FAM149A has been identified as a potential prognostic marker in glioblastoma survival.38 A significant intergenic variant was located ∼80 kilobases from C1orf94, which, although largely uncharacterized, has been associated with epilepsy and glioblastoma.39,40
Adult Ependymoma Variants
The adult GWAS identified a single significant intronic variant in KCNQ3, a gene with a well-established role in epilepsy and developmental disorders, such as autism.41,42 KCNQ3 has been linked to metastasis in thyroid and esophageal cancer, and emerging evidence suggests it may also contribute to broader oncogenic processes and represent a potential therapeutic target.43–45
Strengths and Limitations
Key strengths of this study include its unprecedented sample size, nearly 500 patients pooled from multiple consortia, and analytic separation by age group. Employing both sequencing and imputed genotyping analyses strengthened the robustness of the findings. Rigorous statistical methodologies further support the validity of the associations.
Limitations to note include heterogeneity introduced by multiple genetic platforms and study methodologies, which may increase susceptibility to bias or confounding. Importantly, we did not observe shared significant variants between pediatric and adult analytic sets, highlighting possible biologic differences. Restriction to individuals of European ancestry limits generalizability to other ancestries. As our analysis concentrated on common SNPs, the effect of structural variants and rare mutations remains unknown. Finally, molecular and anatomical tumor subtypes, which also vary by sex, were unavailable, precluding analysis of clinical heterogeneity.
In summary, nearly 500 patients from several consortia contributed to our findings, yielding the most extensive ependymoma-specific GWAS reported to date. Several identified genes are known to be linked with cancer or neurological diseases, and many have not previously been directly associated with ependymoma. The lack of overlapping common variants or genes between age groups suggests that ependymoma pathogenesis may differ substantially by age at disease onset. Future studies may investigate the functional impact of these susceptibility variants and evaluate their utility as biomarkers for diagnosis and prognosis.
Supplementary Material
vdag004_Supplementary_Data
References
- 1. Nelles DG, Hazrati L-N. Ependymal cells and neurodegenerative disease: outcomes of compromised ependymal barrier function. Brain Commun. 2022;4:fcac288. doi:10.1093/braincomms/fcac288. PMID: 36415662. PMCID: PMC9677497.
- 2. Del Bigio MR. Ependymal cells: biology and pathology. Acta Neuropathol. 2010;119:55-73. doi:10.1007/s00401-009-0624-y. PMID: 20024659.
- 3. Rodríguez D, Cheung MC, Housri N, Quinones-Hinojosa A, Camphausen K, Koniaris LG. Outcomes of malignant CNS ependymomas: an examination of 2408 cases through the surveillance, epidemiology, and end results (SEER) database (1973–2005). J Surg Res. 2009;156:340-351. doi:10.1016/j.jss.2009.04.024. PMID: 19577759. PMCID: PMC7371344.
- 4. McGuire CS, Sainani KL, Fisher PG. Incidence patterns for ependymoma: a surveillance, epidemiology, and end results study. J Neurosurg. 2009;110:725-729. doi:10.3171/2008.9.JNS08117. PMID: 19061350.
- 5. Amirian ES, Armstrong TS, Aldape KD, Gilbert MR, Scheurer ME. Predictors of survival among pediatric and adult ependymoma cases: a study using surveillance, epidemiology, and end results data from 1973 to 2007. Neuroepidemiology. 2012;39:116-124. doi:10.1159/000339320. PMID: 22846789. PMCID: PMC3470871.
- 6. Foss-Skiftesvik J, Stoltze UK, van Overeem Hansen T, et al. Redefining germline predisposition in children with molecularly characterized ependymoma: a population-based 20-year cohort. Acta Neuropathol Commun. 2022;10:123. doi:10.1186/s40478-022-01429-1. PMID: 36008825. PMCID: PMC9404601.
- 7. Kuhlen M, Golas MM, Schaller T, et al. Beyond germline genetic testing—heterozygous pathogenic variants in PMS2 in two children with osteosarcoma and ependymoma. Hered Cancer Clin Pract. 2023;21:8. doi:10.1186/s13053-023-00254-4. PMID: 37308967. PMCID: PMC10259054.
- 8. Muskens IS, Zhang C, de Smith AJ, Biegel JA, Walsh KM, Wiemels JL. Germline genetic landscape of pediatric central nervous system tumors. Neuro Oncol. 2019;21:1376-1388. doi:10.1093/neuonc/noz108. PMID: 31247102. PMCID: PMC6827836.
