# Systematic analysis of functional genetic and epigenetic variants in colorectal cancer

**Authors:** Erfei Chen, Qiqi Yang, Haoyang Dai, Yixin Chen, Yihui Zhang, Qianglong Wang, Rongrong Hou, Ming Chen, Jie Wang, Qianwen Xie, Wenju Sun, Yong-Qiang Ning, Ligang Fan, Jian Yan

PMC · DOI: 10.1126/sciadv.aeb2473 · Science Advances · 2026-02-20

## TL;DR

This study uses SNP-STARR-seq and Methyl-STARR-seq to identify genetic and epigenetic variants that alter enhancer activity in primary and metastatic colorectal cancer cells, and reports two hypomethylated CpG loci as tissue-based biomarkers for early detection.

## Contribution

The study adapts SNP-STARR-seq and Methyl-STARR-seq into a platform for screening 30,790 noncoding SNPs and more than 134,000 CpG sites for effects on enhancer activity in CRC, revealing metastasis-specific regulatory elements and diagnostic biomarkers.

## Key findings

- 922 SNPs and 487 CpG-containing elements modulate enhancer activity in primary CRC cells.
- 3136 SNPs and 3008 methylation-sensitive elements show metastasis-specific regulatory effects.
- Two CRC-specific hypomethylated loci (cg08640619 and cg25982657) serve as tissue-based early detection biomarkers with AUROC > 0.96.
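AUROC summarizes how well a single score separates tumor from normal tissue: it is the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample. The sketch below shows that calculation under stated assumptions. The beta values are made up for illustration and are not data from the paper, the function name `auroc` is hypothetical, and `1 - beta` is used as the score only because the loci are described as hypomethylated in CRC.

```python
def auroc(scores, labels):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney U formulation of AUROC).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical methylation beta values (fraction methylated) at one CpG.
# These numbers are invented for illustration only.
tumor_beta = [0.12, 0.20, 0.08, 0.31, 0.15]
normal_beta = [0.78, 0.85, 0.25, 0.91]

betas = tumor_beta + normal_beta
labels = [1] * len(tumor_beta) + [0] * len(normal_beta)

# Lower methylation should indicate tumor, so invert beta to get a score
# where higher means "more tumor-like".
scores = [1.0 - b for b in betas]

result = auroc(scores, labels)
print(f"AUROC on toy data: {result:.2f}")
```

A value near 1.0 means the locus separates the two groups almost perfectly, and 0.5 means it does no better than chance.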

## Abstract

Colorectal cancer (CRC) is a leading cause of cancer-related mortality worldwide, yet the functional impact of noncoding variants on enhancer activity remains largely unexplored. In this study, we adapted and applied two high-throughput techniques, SNP-STARR-seq and Methyl-STARR-seq, to systematically evaluate the influence of 30,790 noncoding SNPs and more than 134,000 CpG sites on enhancer activity in primary and metastatic CRC cells. We identified 922 SNPs and 487 CpG-containing elements modulating enhancer activity in primary cells and found 3136 SNPs and 3008 methylation-sensitive elements with metastasis-specific regulatory effects. Multi-omics integration linked these variants to target genes, and CRISPR editing validated their roles in driving tumorigenic and metastatic phenotypes. Furthermore, we identified two CRC-specific hypomethylated loci, cg08640619 and cg25982657, as exceptional tissue-based early detection biomarkers (AUROC > 0.96). Mechanistically, hypermethylation at cg08640619 disrupts RUNX2 binding, leading to inhibition of KIRREL1 and ETV3. Our study provides a comprehensive platform for understanding how genetic and epigenetic variants disrupt transcriptional programs in CRC, offering insights into disease susceptibility and identifying potential diagnostic and therapeutic targets.

Functional screening pinpoints genetic and epigenetic drivers of colorectal cancer progression and diagnosis.
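As a rough illustration of how a STARR-seq-style screen can score a variant's effect on enhancer activity, the sketch below compares the two alleles of one SNP. It assumes a common convention, activity as the log2 ratio of RNA reads (reporter output) to DNA reads (plasmid input), and it is not confirmed to be the authors' pipeline. The function names, the pseudocount, and the read counts are all hypothetical, and the counts are assumed to be already normalized for sequencing depth.

```python
import math


def enhancer_activity(rna_count, dna_count, pseudocount=1.0):
    """log2(RNA / DNA) with a pseudocount to avoid division by zero."""
    return math.log2((rna_count + pseudocount) / (dna_count + pseudocount))


def allelic_effect(ref_counts, alt_counts):
    """Difference in activity between alternate and reference alleles.

    Each argument is an (rna_count, dna_count) tuple. A negative result
    means the alternate allele weakens the enhancer in this toy model.
    """
    return enhancer_activity(*alt_counts) - enhancer_activity(*ref_counts)


# Invented read counts for one hypothetical SNP: (RNA reads, DNA reads).
ref_counts = (400, 200)
alt_counts = (100, 200)

effect = allelic_effect(ref_counts, alt_counts)
print(f"log2 allelic effect (alt vs ref): {effect:.2f}")
```

A real analysis would also need replicates and a statistical test before calling a variant functional. This toy version shows only the effect-size arithmetic.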

## Linked entities

- **Genes:** RUNX2 (RUNX family transcription factor 2) [NCBI Gene 860], KIRREL1 (kirre like nephrin family adhesion molecule 1) [NCBI Gene 55243], ETV3 (ETS variant transcription factor 3) [NCBI Gene 2117]
- **Diseases:** colorectal cancer (MONDO:0005575)

## Full-text entities

- **Genes:** XCL1 (X-C motif chemokine ligand 1) [NCBI Gene 6375] {aka ATAC, LPTN, LTN, SCM-1, SCM-1a, SCM1}, ADRM1 (ADRM1 26S proteasome ubiquitin receptor) [NCBI Gene 11047] {aka ARM-1, ARM1, GP110, PSMD16}, LRRC61 (leucine rich repeat containing 61) [NCBI Gene 65999] {aka HSPC295}, GAPDH (glyceraldehyde-3-phosphate dehydrogenase) [NCBI Gene 2597] {aka G3PD, GAPD, HEL-S-162eP}, JUNB (JunB proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 3726] {aka AP-1}, IDH1 (isocitrate dehydrogenase (NADP(+)) 1) [NCBI Gene 3417] {aka HEL-216, HEL-S-26, IDCD, IDH, IDP, IDPC}, RPS21 (ribosomal protein S21) [NCBI Gene 6227] {aka HLDF, S21, eS21}, CABLES2 (Cdk5 and Abl enzyme substrate 2) [NCBI Gene 81928] {aka C20orf150, dJ908M14.2, ik3-2}, TGFB1 (transforming growth factor beta 1) [NCBI Gene 7040] {aka CAEND1, CED, DPD1, IBDIMDE, LAP, TGF-beta1}, LRRC15 (leucine rich repeat containing 15) [NCBI Gene 131578] {aka LIB}, TAF4 (TATA-box binding protein associated factor 4) [NCBI Gene 6874] {aka MRD73, TAF(II)130, TAF(II)135, TAF2C, TAF2C1, TAF4A}, GFI1 (growth factor independent 1 transcriptional repressor) [NCBI Gene 2672] {aka GFI-1, GFI1A, SCN2, ZNF163}, LINC00460 (long intergenic non-protein coding RNA 460) [NCBI Gene 728192], KRAS (KRAS proto-oncogene, GTPase) [NCBI Gene 3845] {aka C-K-RAS, CFC2, K-RAS2A, K-RAS2B, K-RAS4A}, OSBPL2 (oxysterol binding protein like 2) [NCBI Gene 9885] {aka DFNA67, DIDA, DNFA67, ORP-2, ORP2}, HAR1A (highly accelerated region 1A) [NCBI Gene 768096] {aka HAR1F, LINC00064, NCRNA00064}, SIX2 (SIX homeobox 2) [NCBI Gene 10736], MTG2 (mitochondrial ribosome associated GTPase 2) [NCBI Gene 26164] {aka GTPBP5, ObgH1, dJ1005F21.2}, HNF1A (HNF1 homeobox A) [NCBI Gene 6927] {aka HNF-1-alpha, HNF-1A, HNF1, HNF1alpha, IDDM20, LFB1}, F3 (coagulation factor III, tissue factor) [NCBI Gene 2152] {aka CD142, TF, TFA}, ETV3 (ETS variant transcription factor 3) [NCBI Gene 2117] {aka METS, PE-1, PE1}, TP53 (tumor protein p53) [NCBI Gene 7157] {aka BCC7, BMFS5, LFS1, P53, TRP53}, RUNX2 (RUNX family transcription factor 2) [NCBI Gene 860] {aka AML3, CBF-alpha-1, CBFA1, CCD, CCD1, CLCD}, DNMT3A (DNA methyltransferase 3 alpha) [NCBI Gene 1788] {aka DNMT3A2, HESJAS, M.HsaIIIA, TBRS}, FOSL2 (FOS like 2, AP-1 transcription factor subunit) [NCBI Gene 2355] {aka ACED, FRA2}, DIDO1 (death inducer-obliterator 1) [NCBI Gene 11083] {aka BYE1, C20orf158, DATF-1, DATF1, DIDO2, DIDO3}, LRRC32 (leucine rich repeat containing 32) [NCBI Gene 2615] {aka CPPRDD, D11S833E, GARP}, YTHDF1 (YTH N6-methyladenosine RNA binding protein F1) [NCBI Gene 54915] {aka C20orf21, DF1}, BATF (basic leucine zipper ATF-like transcription factor) [NCBI Gene 10538] {aka B-ATF, BATF1, SFA-2, SFA2}, TCF12 (transcription factor 12) [NCBI Gene 6938] {aka CRS3, HEB, HH26, HTF4, HsT17266, TCF-12}, TCF7L2 (transcription factor 7 like 2) [NCBI Gene 6934] {aka TCF-4, TCF4}, APC (APC regulator of Wnt signaling pathway) [NCBI Gene 324] {aka BTPS2, DESMD, DP2, DP2.5, DP3, GS}, JUND (JunD proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 3727] {aka AP-1}, RBBP8NL (RBBP8 N-terminal like) [NCBI Gene 140893] {aka C20orf151}, LINC00659 (long intergenic non-protein coding RNA 659) [NCBI Gene 100652730], MYC (MYC proto-oncogene, bHLH transcription factor) [NCBI Gene 4609] {aka MRTL, MYCC, bHLHe39, c-Myc}, JUN (Jun proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 3725] {aka AP-1, AP1, c-Jun, cJUN, p39}, RNASE1 (ribonuclease A family member 1, pancreatic) [NCBI Gene 6035] {aka RAC1, RIB1, RNS1}
- **Diseases:** glioblastoma (MESH:D005909), lymph node metastasis (MESH:D008207), metastatic aggressiveness (MESH:D010554), Metastasis (MESH:D009362), COAD (MESH:D015179), tumorigenic (MESH:D002471), death (MESH:D003643), sarcoma (MESH:D012509), AML (MESH:D015470), colon adenocarcinoma (MESH:D003110), head and neck squamous cell carcinoma (MESH:D000077195), stomach adenocarcinoma (MESH:D013274), oncogenesis (MESH:D063646), Cancer (MESH:D009369), READ (MESH:D000230), metastatic (MESH:D000092182), melanoma (MESH:D008545), pancreatic adenocarcinoma (MESH:D010190)
- **Chemicals:** LiCl (MESH:D018021), A (MESH:D001151), H (MESH:D006859), KCl (MESH:D011189), formaldehyde (MESH:D005557), CO2 (MESH:D002245), Agarose (MESH:D012685), oligodeoxynucleotide (MESH:D009838), paraformaldehyde (MESH:C003043), crystal violet (MESH:D005840), FIMO (-), puromycin (MESH:D011691), glycerol (MESH:D005990), Hepes (MESH:D006531), SDS (MESH:D012967), oligonucleotides (MESH:D009841), dithiothreitol (MESH:D004229), glycine (MESH:D005998), H2O (MESH:D014867), TRIzol (MESH:C411644), EDTA (MESH:D004492), sodium bisulfite (MESH:C009279), IGEPAL CA-630 (MESH:C010615), Triton X-100 (MESH:D017830), sodium deoxycholate (MESH:D003840), NaCl (MESH:D012965), Zeocin (MESH:C105427)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090], Escherichia coli (E. coli, species) [taxon 562]
- **Mutations:** A to C, rs6061231, rs67941624, rs6983267, G/A, rs1962004, T/T, rs67941642
- **Cell lines:** SW480: Homo sapiens (Human), Colon adenocarcinoma, Cancer cell line (CVCL_0546); DH5alpha: Drosophila hydei (Fruit fly), Spontaneously immortalized cell line (CVCL_Z531); SW620: Homo sapiens (Human), Colon adenocarcinoma, Cancer cell line (CVCL_0547); HEK: Homo sapiens (Human), Human papillomavirus-related endocervical adenocarcinoma, Cancer cell line (CVCL_M624); HCT-116: Homo sapiens (Human), Colon carcinoma, Cancer cell line (CVCL_0291); 293T: Homo sapiens (Human), Transformed cell line (CVCL_0063); GT115: Homo sapiens (Human), Spinocerebellar ataxia type 1, Induced pluripotent stem cell (CVCL_ZA12); pLKO.1: Mus musculus (Mouse), Hybridoma (CVCL_C7RB)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12922747/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12922747/full.md

## References

56 references — full list in the complete paper: https://tomesphere.com/paper/PMC12922747/full.md

---
Source: https://tomesphere.com/paper/PMC12922747