# Machine learning-predicted chromatin organization landscape across pediatric tumors

**Authors:** Ketrin Gjoni, Shu Zhang, Rachel E. Yan, Bo Zhang, Daniel Miller, Jeffrey P. Greenfield, Adam Resnick, Nadia Dahmane, Katherine S. Pollard

PMC · DOI: 10.1038/s41598-026-44925-3 · Scientific Reports · 2026-03-28

## TL;DR

This study uses machine learning to predict how structural variants disrupt chromatin organization in pediatric tumors, identifying recurrently disrupted regions and candidate genes potentially involved in tumorigenesis.

## Contribution

The main contribution is the application of a convolutional neural network, Akita, to predict genome folding disruptions caused by somatic SVs across 61 tumor types from the Children’s Brain Tumor Network dataset.
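To make "predicted disruption" concrete, the sketch below compares a contact map predicted for the reference sequence with one predicted for the SV-bearing sequence. It does not run Akita: the two arrays stand in for model outputs, and the function name `disruption_scores` and the choice of mean squared error plus one minus Pearson correlation are illustrative assumptions, not necessarily the scores used in the paper.

```python
import numpy as np


def disruption_scores(ref_map, alt_map):
    """Compare two predicted contact maps of identical shape.

    Returns (mse, one_minus_corr). Entries that are NaN in either map
    are ignored. Both maps are placeholders for model predictions.
    """
    ref = np.asarray(ref_map, dtype=float)
    alt = np.asarray(alt_map, dtype=float)
    if ref.shape != alt.shape:
        raise ValueError("contact maps must have the same shape")

    ref, alt = ref.ravel(), alt.ravel()
    keep = ~(np.isnan(ref) | np.isnan(alt))
    ref, alt = ref[keep], alt[keep]

    mse = float(np.mean((ref - alt) ** 2))
    corr = float(np.corrcoef(ref, alt)[0, 1])
    return mse, 1.0 - corr


# Toy example: a symmetric 8x8 "reference" map and a copy with one
# perturbed 3x3 block standing in for an SV-induced change.
rng = np.random.default_rng(0)
ref = rng.normal(size=(8, 8))
ref = (ref + ref.T) / 2
alt = ref.copy()
alt[2:5, 2:5] += 1.0

mse, one_minus_corr = disruption_scores(ref, alt)
```

A higher MSE or a higher one-minus-correlation value indicates that the predicted map with the variant departs further from the reference prediction.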

## Key findings

- The most disruptive SVs are predicted in lymphomas and sarcomas, metastatic tumors, and germline cell tumors.
- Five recurrently disrupted regions enriched for high-impact SVs were identified across multiple tumors.
- A modified Activity-by-Contact score that integrates epigenetic data prioritized highly disruptive SVs near key oncogenes and at novel candidate loci.

## Abstract

Structural variants (SVs) are increasingly recognized as important contributors to oncogenesis through their effects on 3D genome folding. Recent advances in whole-genome sequencing have enabled large-scale profiling of SVs across diverse tumors, yet experimental characterization of their individual impact on genome folding remains infeasible. Here, we leveraged a convolutional neural network, Akita, to predict disruptions in genome folding caused by somatic SVs identified in 61 tumor types from the Children’s Brain Tumor Network dataset. Our analysis reveals significant variability in SV-induced disruptions across tumor types, with the most disruptive SVs coming from lymphomas and sarcomas, metastatic tumors, and germline cell tumors. Dimensionality reduction of disruption scores identified five recurrently disrupted regions enriched for high-impact SVs across multiple tumors. Some of these regions are highly disrupted despite not being highly mutated, and harbor tumor-associated genes and transcriptional regulators. To further interpret the functional relevance of high-scoring SVs, we integrated epigenetic data and developed a modified Activity-by-Contact scoring approach to prioritize SVs with disrupted genome contacts at active enhancers. This method highlighted highly disruptive SVs near key oncogenes, as well as novel candidate loci potentially implicated in tumorigenesis. These findings highlight the utility of machine learning for identifying novel SVs, loci, and genetic mechanisms contributing to pediatric cancers. This framework provides a foundation for future studies linking SV-driven regulatory changes to cancer pathogenesis.

The online version contains supplementary material available at 10.1038/s41598-026-44925-3.
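The abstract mentions a modified Activity-by-Contact (ABC) scoring approach but does not specify the modification. As a rough illustration only, the sketch below computes the standard ABC score (enhancer activity times enhancer-gene contact, normalized over all candidate enhancers of one gene) and then, as an assumption of this example, takes the difference between scores computed with reference and SV-altered contacts. The function names `abc_scores` and `abc_shift`, the input values, and the requirement that contacts be non-negative are all hypothetical.

```python
import numpy as np


def abc_scores(activity, contact):
    """Standard ABC score for each candidate enhancer of one gene.

    score_i = activity_i * contact_i / sum_j(activity_j * contact_j)
    Inputs are assumed non-negative and of equal length.
    """
    activity = np.asarray(activity, dtype=float)
    contact = np.asarray(contact, dtype=float)
    weighted = activity * contact
    total = weighted.sum()
    if total == 0:
        return np.zeros_like(weighted)
    return weighted / total


def abc_shift(activity, contact_ref, contact_alt):
    """Hypothetical per-enhancer change in ABC score caused by an SV.

    Negative values mean the enhancer loses regulatory share for the
    gene under the altered contacts; positive values mean it gains.
    """
    return abc_scores(activity, contact_alt) - abc_scores(activity, contact_ref)


# Made-up values: three enhancers near one gene. The SV reduces the
# contact of the most active enhancer (index 0) with the promoter.
activity = [10.0, 2.0, 5.0]
contact_ref = [0.5, 0.1, 0.2]
contact_alt = [0.1, 0.1, 0.2]

shift = abc_shift(activity, contact_ref, contact_alt)
```

Because the scores for a gene sum to one under both conditions, the shifts sum to zero: share lost by one enhancer is redistributed to the others.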

## Full-text entities

- **Genes:** FOXJ2 (forkhead box J2) [NCBI Gene 55810] {aka FHX}, ABCB6 (ATP binding cassette subfamily B member 6 (LAN blood group)) [NCBI Gene 10058] {aka ABC, LAN, MTABC3, PRP, umat}, TERT (telomerase reverse transcriptase) [NCBI Gene 7015] {aka CMM9, DKCA2, DKCB4, EST2, PFBMFT1, TCS1}, PPP2R2C (protein phosphatase 2 regulatory subunit Bgamma) [NCBI Gene 5522] {aka B55-GAMMA, B55gamma, IMYPNO, IMYPNO1, PR52, PR55G}, ID2 (inhibitor of DNA binding 2) [NCBI Gene 3398] {aka GIG8, ID2A, ID2H, bHLHb26}, EGLN3 (egl-9 family hypoxia inducible factor 3) [NCBI Gene 112399] {aka HIFP4H3, HIFPH3, PHD3}, MYB (MYB proto-oncogene, transcription factor) [NCBI Gene 4602] {aka Cmyb, c-myb, c-myb_CDS, efg}, CCND1 (cyclin D1) [NCBI Gene 595] {aka BCL1, D11S287E, PRAD1, U21B31}, NASP (nuclear autoantigenic sperm protein) [NCBI Gene 4678] {aka FLB7527, HMDRA1, PRO1999}, MSH6 (mutS homolog 6) [NCBI Gene 2956] {aka GTBP, GTMBP, HNPCC5, HSAP, LYNCH5, MMRCS3}, RHOA (ras homolog family member A) [NCBI Gene 387] {aka ARH12, ARHA, EDFAOB, RHO12, RHOH12}, QKI (QKI, KH domain containing RNA binding) [NCBI Gene 9444] {aka Hqk, QK, QK1, QK3, hqkI}, AKT3 (AKT serine/threonine kinase 3) [NCBI Gene 10000] {aka MPPH, MPPH2, PKB-GAMMA, PKBG, PRKBG, RAC-PK-gamma}, MSH3 (mutS homolog 3) [NCBI Gene 4437] {aka DUP, FAP4, MRP1}, ZBTB18 (zinc finger and BTB domain containing 18) [NCBI Gene 10472] {aka C1DELq42q44, C1DELq43q44, C2H2-171, DEL1Q42Q44, DEL1Q43Q44, MRD22}, BCL7C (BAF chromatin remodeling complex subunit BCL7C) [NCBI Gene 9274] {aka SMARCJ3}, GNA12 (G protein subunit alpha 12) [NCBI Gene 2768] {aka HG1M1, NNX3, RMP, gep}, PDGFRA (platelet derived growth factor receptor alpha) [NCBI Gene 5156] {aka CD140A, PDGFR-2, PDGFR2}, ABL1 (ABL proto-oncogene 1, non-receptor tyrosine kinase) [NCBI Gene 25] {aka ABL, BCR-ABL, CHDSKM, JTK7, bcr/abl, c-ABL}, SMARCB1 (SWI/SNF related BAF chromatin remodeling complex subunit B1) [NCBI Gene 6598] {aka BAF47, CSS3, INI-1, INI1, MRD15, PPP1R144}, MYCN (MYCN proto-oncogene, bHLH transcription factor) [NCBI Gene 4613] {aka FGLDS1, MODED, MPAPA, MYCNsORF, MYCNsPEP, N-myc}
- **Diseases:** DUPs (MESH:C536732), Brain Tumor (MESH:D001932), pHGGs (MESH:D008228), oligodendroglioma (MESH:D009837), GCT (MESH:C537296), mental disorders (MESH:D001523), Lymphomas (MESH:D008223), ependymoma (MESH:D004806), INVs (MESH:C580205), SEGA (MESH:D001254), renal cell carcinoma (MESH:D002292), ovarian, and colorectal cancers (MESH:D010051), ganglioglioma (MESH:D018303), medulloblastoma (MESH:D008527), DNET (MESH:D018302), peripheral nerve tumors (MESH:D010524), hemangioblastoma (MESH:D018325), Ewing sarcoma (MESH:D012512), Meningiomas (MESH:D008579), embryonal tumors (MESH:D009373), rhabdomyosarcomas (MESH:D012208), sarcomas (MESH:D012509), Neuroblastomas (MESH:D009447), DIPG (MESH:D000080443), melanomas (MESH:D008545), glial-neuronal tumor (MESH:D005910), glioblastoma (MESH:D005909), gliosis (MESH:D005911), mesenchymal tumors (MESH:C535700), Tumor (MESH:D009369), leukemias (MESH:D007938), DELs (MESH:D054877), germline cell tumors (MESH:D005935), oncogenesis (MESH:D063646), ATRT (MESH:C000597569)
- **Chemicals:** CBTN (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Cell lines:** DIPG007 — Homo sapiens (Human), Diffuse intrinsic pontine glioma, Cancer cell line (CVCL_VU70), KNS42 — Homo sapiens (Human), Glioblastoma, IDH-wildtype, Cancer cell line (CVCL_0378), D283 — Homo sapiens (Human), Medulloblastoma, non-WNT/non-SHH, group 3, Cancer cell line (CVCL_1155), BT16 — Homo sapiens (Human), Atypical teratoid/rhabdoid tumor, Cancer cell line (CVCL_M156)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13039956/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13039956/full.md

## References

8 references — full list in the complete paper: https://tomesphere.com/paper/PMC13039956/full.md

---
Source: https://tomesphere.com/paper/PMC13039956