# Integrative machine learning and bioinformatics analysis to identify cellular senescence-related genes and potential therapeutic targets in ulcerative colitis and colorectal cancer

**Authors:** Tianle Xue, Yunpeng Chen, Xiaomeng Li, Zhixiang Zhou, Qiyang Chen

PMC · DOI: 10.3389/fbinf.2025.1599098 · Frontiers in Bioinformatics · 2025-07-28

## TL;DR

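This study combines machine learning and bioinformatics analyses of six GEO datasets to find genes linked to cellular senescence in ulcerative colitis (UC) and colorectal cancer (CRC), and proposes five of them (ABCB1, CXCL1, TACC3, TGFBI, and VDR) as candidate diagnostic biomarkers and therapeutic targets.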

## Contribution

The study introduces an integrated machine learning (IML) model, built from 12 algorithms with 10-fold cross-validation, to identify senescence-related diagnostic biomarkers and therapeutic targets in the progression from UC to CRC.

## Key findings

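- In the weighted gene co-expression network analysis (WGCNA), the turquoise module showed the strongest association with UC and CRC traits.
- Five genes (ABCB1, CXCL1, TACC3, TGFBI, and VDR) were identified as key diagnostic biomarkers; each had an individual AUC > 0.7, and their combined diagnostic model reached an AUC of 0.989.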
- Drug enrichment and molecular docking identified potential therapeutic candidates targeting these genes.

## Abstract

Ulcerative colitis (UC) is a chronic inflammatory condition that predisposes patients to colorectal cancer (CRC) through mechanisms that remain largely undefined. Given the pivotal role of cellular senescence in both chronic inflammation and tumorigenesis, we integrated machine learning and bioinformatics approaches to identify senescence‐related biomarkers and potential therapeutic targets involved in the progression from UC to CRC.

Gene expression profiles from six GEO datasets were analyzed to identify differentially expressed genes (DEGs) using the limma package in R. Weighted gene co-expression network analysis (WGCNA) was employed to delineate modules significantly associated with UC and CRC, and the intersection of DEGs, key module genes, and senescence‐related genes from the CellAge database yielded 112 candidate genes. An integrated machine learning (IML) model—utilizing 12 algorithms with 10-fold cross-validation—was constructed to pinpoint key diagnostic biomarkers. The diagnostic performance of the candidate genes was evaluated using receiver operating characteristic (ROC) analyses in both training and validation cohorts. In addition, immune cell infiltration, protein–protein interaction (PPI) networks, and drug enrichment analyses—including molecular docking—were performed to further elucidate the biological functions and therapeutic potentials of the identified genes.
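The candidate-gene step described above amounts to a three-way set intersection. The following is a minimal Python sketch of that idea only, assuming each source (DEGs, key module genes, CellAge entries) has already been reduced to a list of gene symbols. The function name `candidate_genes` and the placeholder symbols are hypothetical and are not taken from the paper, which reports using R (limma) for the upstream analysis.

```python
def candidate_genes(degs, module_genes, senescence_genes):
    """Return the sorted gene symbols present in all three collections."""
    return sorted(set(degs) & set(module_genes) & set(senescence_genes))


# Placeholder symbols for illustration only (not the study's gene lists).
degs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
module_genes = ["GENE_B", "GENE_C", "GENE_D", "GENE_E"]
senescence_genes = ["GENE_C", "GENE_D", "GENE_F"]

candidates = candidate_genes(degs, module_genes, senescence_genes)
print(candidates)
```

In the study, the same kind of intersection over the real lists yielded the 112 candidate genes passed on to the IML model.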

Our analysis revealed significant transcriptomic alterations in UC and CRC tissues, with the turquoise module demonstrating the strongest association with disease traits. The IML approach identified five pivotal genes (ABCB1, CXCL1, TACC3, TGFBI, and VDR) that individually exhibited AUC values > 0.7, while their combined diagnostic model achieved an AUC of 0.989. Immune infiltration analyses uncovered distinct immune profiles correlating with these biomarkers, and the PPI network confirmed robust interactions among them. Furthermore, drug enrichment and molecular docking studies identified several promising therapeutic candidates targeting these senescence-related genes.
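For readers less familiar with the metric, an AUC can be read as the probability that a randomly chosen disease sample receives a higher score than a randomly chosen control. The sketch below implements that pairwise definition in plain Python. The labels and scores are made up for illustration and are not study data, and `roc_auc` is a hypothetical helper rather than the authors' code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 = disease, 0 = control).
    scores: iterable of numeric scores, e.g. a gene's expression level.
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: three disease samples followed by three controls.
labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.8, 0.5, 0.2, 0.1]

auc = roc_auc(labels, toy_scores)
print(round(auc, 3))
```

On this scale, 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the context for the reported per-gene threshold of 0.7 and the combined-model value of 0.989.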

This study provides novel insights into the molecular interplay between cellular senescence and the UC-to-CRC transition. The identified biomarkers not only offer strong diagnostic potential but also represent promising targets for therapeutic intervention, paving the way for improved clinical management of UC-associated CRC.

## Linked entities

- **Genes:** ABCB1 (ATP binding cassette subfamily B member 1) [NCBI Gene 5243], CXCL1 (C-X-C motif chemokine ligand 1) [NCBI Gene 2919], TACC3 (transforming acidic coiled-coil containing protein 3) [NCBI Gene 10460], TGFBI (transforming growth factor beta induced) [NCBI Gene 7045], VDR (vitamin D receptor) [NCBI Gene 7421]
- **Diseases:** ulcerative colitis (MONDO:0005101), colorectal cancer (MONDO:0005575)

## Full-text entities

- **Genes:** TGFBI (transforming growth factor beta induced) [NCBI Gene 7045] {aka BIGH3, CDB1, CDG2, CDGG1, CSD, CSD1}, TACC3 (transforming acidic coiled-coil containing protein 3) [NCBI Gene 10460] {aka ERIC-1, ERIC1, Tacc4, maskin}, VDR (vitamin D receptor) [NCBI Gene 7421] {aka NR1I1, PPP1R163}, CXCL1 (C-X-C motif chemokine ligand 1) [NCBI Gene 2919] {aka FSP, GRO1, GROa, MGSA, MGSA-a, NAP-3}, ABCB1 (ATP binding cassette subfamily B member 1) [NCBI Gene 5243] {aka ABC20, CD243, CLCS, ENPAT, GP170, MDR1}
- **Diseases:** chronic inflammation (MESH:D007249), CRC (MESH:D015179), tumorigenesis (MESH:D063646), UC (MESH:D003093)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12336176/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12336176/full.md

## References

74 references — full list in the complete paper: https://tomesphere.com/paper/PMC12336176/full.md

---
Source: https://tomesphere.com/paper/PMC12336176