# Transcriptional patterns of cancer-related genes in primary and metastatic tumours revealed by machine learning

**Authors:** Faeze Keshavarz-Rahaghi, Erin Pleasance, Steven J. M. Jones

PMC · DOI: 10.1186/s12915-025-02339-z · 2025-08-07

## TL;DR

This study uses random forest models to identify how alterations in cancer-related genes reshape the tumor transcriptome, revealing patterns that could inform targeted therapies.

## Contribution

The study applies random forest models to identify transcriptional patterns associated with the loss of wild-type activity in cancer-related genes across tumor types.

## Key findings

- Genes like TP53 and CDKN2A show consistent pan-cancer transcriptional patterns, while others like ATRX, BRAF, and NRAS show tumor-type-specific patterns.
- DRG2 is a key contributor to identifying ATRX alterations in lower-grade gliomas and is downregulated in ATRX mutant tumors.
- AURKA inhibitors are suggested as potential therapies for tumors with alterations in FBXW7 or NSD1.

## Abstract

A key to understanding cancer is to determine the impact on the cellular pathways caused by the repertoire of DNA changes accrued in a cancer cell. Exploring the interactions between genomic aberrations and the expressed transcriptome can not only improve our understanding of the disease but also identify potential therapeutic approaches.

Using random forest models, we successfully identified transcriptional patterns associated with the loss of wild-type activity in cancer-related genes across various tumour types. While genes like TP53 and CDKN2A exhibited unique pan-cancer transcriptional patterns, others like ATRX, BRAF, and NRAS showed tumour-type-specific expression patterns. We also observed that genes like AR and ERBB4 did not lead to strong detectable patterns in the transcriptome when disrupted. Our investigation has also led to the identification of genes highly associated with transcriptional patterns. For instance, DRG2 emerged as the top contributor in classification of ATRX alterations in lower-grade gliomas and was significantly downregulated in ATRX mutant tumours. Additionally, transcriptional features important in classification of PTEN aberrations, such as CDCA8, AURKA, and CDC20, were found to be closely related to PTEN function.
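The idea of classifying alteration status from expression and then ranking genes by their contribution can be illustrated with a toy example. The sketch below is not the authors' pipeline: it is a pure-Python, random-forest-style ensemble of depth-1 trees (bootstrap samples plus random feature subsets), scored by mean Gini decrease, and run on synthetic data. The gene names (`signal_gene`, `noise_gene_*`), sample size, effect size, and all hyperparameters are assumptions chosen only for illustration.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def majority(labels):
    """Majority class of 0/1 labels (ties go to 1, empty lists to 0)."""
    return int(2 * sum(labels) >= len(labels)) if labels else 0

def fit_stump(X, y, features):
    """Best single split among `features`, chosen by Gini decrease."""
    n, parent = len(y), gini(y)
    best = None  # (decrease, feature, threshold, left_label, right_label)
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y[i] for i in range(n) if X[i][f] <= thr]
            right = [y[i] for i in range(n) if X[i][f] > thr]
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or parent - child > best[0]:
                best = (parent - child, f, thr, majority(left), majority(right))
    return best

def fit_forest(X, y, n_trees=100, seed=0):
    """Bagged stumps with random feature subsets; returns stumps and importances."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    mtry = max(1, int(p ** 0.5))  # features considered per tree
    stumps, importance = [], [0.0] * p
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        stump = fit_stump(Xb, yb, rng.sample(range(p), mtry))
        if stump is None:
            continue
        stumps.append(stump)
        importance[stump[1]] += stump[0] / n_trees  # mean Gini decrease
    return stumps, importance

def predict(stumps, row):
    """Majority vote over all stumps for one expression profile."""
    votes = [left if row[f] <= thr else right for _, f, thr, left, right in stumps]
    return majority(votes)

# Synthetic data (assumption): one gene is downregulated in altered samples,
# five genes carry no signal.
data_rng = random.Random(42)
genes = ["signal_gene"] + [f"noise_gene_{i}" for i in range(1, 6)]
X, y = [], []
for i in range(80):
    altered = i % 2  # 1 = altered, 0 = wild-type
    row = [data_rng.gauss(5.0 - 2.5 * altered, 1.0)]
    row += [data_rng.gauss(5.0, 1.0) for _ in range(5)]
    X.append(row)
    y.append(altered)

stumps, importance = fit_forest(X, y)
ranked = sorted(zip(genes, importance), key=lambda g: g[1], reverse=True)
for gene, score in ranked:
    print(f"{gene}\t{score:.3f}")
```

In practice a library implementation with full-depth trees would be used on real expression matrices; the toy version only shows why a gene whose expression tracks alteration status rises to the top of the importance ranking.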

Our findings demonstrate the utility of machine learning in the interpretation of cancer genomic data and provide new avenues for the development of targeted therapies tailored to individual patients with cancer. Our analysis of the transcriptome revealed genes with expression levels strongly correlated with alterations in cancer-related genes. Additionally, we identified AURKA inhibitors as a potential therapeutic option for tumours with alterations in tumour suppressors like FBXW7 or NSD1.

The online version contains supplementary material available at 10.1186/s12915-025-02339-z.

## Linked entities

- **Genes:** TP53 (tumor protein p53) [NCBI Gene 7157], CDKN2A (cyclin dependent kinase inhibitor 2A) [NCBI Gene 1029], ATRX (ATRX chromatin remodeler) [NCBI Gene 546], BRAF (B-Raf proto-oncogene, serine/threonine kinase) [NCBI Gene 673], NRAS (NRAS proto-oncogene, GTPase) [NCBI Gene 4893], AR (androgen receptor) [NCBI Gene 367], ERBB4 (erb-b2 receptor tyrosine kinase 4) [NCBI Gene 2066], DRG2 (developmentally regulated GTP binding protein 2) [NCBI Gene 1819], PTEN (phosphatase and tensin homolog) [NCBI Gene 5728], CDCA8 (cell division cycle associated 8) [NCBI Gene 55143], AURKA (aurora kinase A) [NCBI Gene 6790], CDC20 (cell division cycle 20) [NCBI Gene 991], FBXW7 (F-box and WD repeat domain containing 7) [NCBI Gene 55294], NSD1 (nuclear receptor binding SET domain protein 1) [NCBI Gene 64324]
- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Genes:** CDKN2A (cyclin dependent kinase inhibitor 2A) [NCBI Gene 1029] {aka ARF, CAI2, CDK4I, CDKN2, CMM2, INK4}, DRG2 (developmentally regulated GTP binding protein 2) [NCBI Gene 1819], BRAF (B-Raf proto-oncogene, serine/threonine kinase) [NCBI Gene 673] {aka B-RAF1, B-raf, BRAF-1, BRAF1, NS7, RAFB1}, PTEN (phosphatase and tensin homolog) [NCBI Gene 5728] {aka 10q23del, BZS, CWS1, DEC, GLM2, MHAM}, CDCA8 (cell division cycle associated 8) [NCBI Gene 55143] {aka BOR, BOREALIN, DasraB, MESRGP}, ATRX (ATRX chromatin remodeler) [NCBI Gene 546] {aka JMS, MRX52, RAD54, RAD54L, XH2, XNP}, FBXW7 (F-box and WD repeat domain containing 7) [NCBI Gene 55294] {aka AGO, CDC4, DEDHIL, FBW6, FBW7, FBX30}, ERBB4 (erb-b2 receptor tyrosine kinase 4) [NCBI Gene 2066] {aka ALS19, HER4, p180erbB4}, CDC20 (cell division cycle 20) [NCBI Gene 991] {aka CDC20A, OOMD14, OZEMA14, bA276H19.3, p55CDC}, NRAS (NRAS proto-oncogene, GTPase) [NCBI Gene 4893] {aka ALPS4, CMNS, N-ras, NCMS, NRAS1, NS6}, TP53 (tumor protein p53) [NCBI Gene 7157] {aka BCC7, BMFS5, LFS1, P53, TRP53}, NSD1 (nuclear receptor binding SET domain protein 1) [NCBI Gene 64324] {aka ARA267, KMT3B, SOTOS, SOTOS1, STO}, AURKA (aurora kinase A) [NCBI Gene 6790] {aka AIK, ARK1, AURA, BTAK, PPP1R47, STK15}
- **Diseases:** gliomas (MESH:D005910), cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12329921/full.md

---
Source: https://tomesphere.com/paper/PMC12329921