# Identification of novel molecular subtypes and construction of a prognostic signature via multi-omics analysis and machine learning in lung adenocarcinoma

**Authors:** Ke Ma, Jie Xu, Congyue Wang, Xu Cao, Wenjie Yu, Jingjing Xi, Xuan Zhang, Jiamin Zhan, Yang Liu, Aoyang Yu, Shuhan Liu, Yanhua Liu, Chong Chen, Xiaoli Mai

PMC · DOI: 10.3389/fonc.2025.1590216 · Frontiers in Oncology · 2025-07-21

## TL;DR

This study identifies new molecular subtypes of lung adenocarcinoma and creates a machine learning-based prognostic tool that improves survival prediction and highlights ANLN as a potential treatment target.

## Contribution

A novel multi-omics and machine learning-driven prognostic signature (MO-MLPS) is developed and validated across multiple datasets, outperforming existing signatures.

## Key findings

- The MO-MLPS outperformed 49 previously published signatures and was validated in the TCGA-LUAD cohort, six independent GEO datasets, and their composite meta-cohort.
- High-risk patients had significantly worse overall and progression-free survival than low-risk patients.
- ANLN knockdown inhibited cancer cell proliferation and migration and improved docetaxel efficacy in vitro.

## Abstract

The development of high-throughput sequencing technologies and targeted therapeutic strategies has significantly improved the prognosis of lung adenocarcinoma (LUAD) patients with sensitive gene mutations. However, patients harboring rare or no actionable mutations rarely benefit from these targeted therapies. This study aimed to identify novel molecular subtypes and construct a prognostic signature to enhance the stratification of LUAD prognosis.

Novel molecular subtypes of LUAD patients were identified by applying 10 distinct clustering algorithms on multi-omics data. Single-cell RNA-sequencing (scRNA-seq) data were integrated to characterize subtype-specific immune microenvironments. A multi-omics and machine learning-driven prognostic signature (MO-MLPS) was constructed in The Cancer Genome Atlas (TCGA) LUAD dataset using ten machine learning algorithms and subsequently validated across six independent datasets from the Gene Expression Omnibus (GEO) database. The robustness of the model was assessed using the concordance index (C-index), Kaplan-Meier survival analyses, receiver operating characteristic (ROC) curves, and both univariate and multivariate Cox regression analyses. We further confirmed the effects of ANLN knockdown and the expression of a domain-negative anillin protein (dnANLN) via western blotting, cell proliferation assays, flow cytometry, and transwell migration assays in vitro.
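To make the concordance index mentioned above concrete, here is a minimal sketch of Harrell's C-index in plain Python. It is not the authors' pipeline: the function name `concordance_index` and the toy follow-up times, event flags, and risk scores are all hypothetical, and tied survival times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: the fraction of comparable patient pairs whose
    predicted risk ordering agrees with the observed survival ordering.

    times        follow-up time per patient
    events       1 if the event (death) was observed, 0 if censored
    risk_scores  model output; a higher score should mean shorter survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter follow-up.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event;
        # pairs with tied times are skipped in this simplified sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (illustration only, not data from the paper).
toy_times = [5, 10, 15, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.7, 0.8, 0.3, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risk)
print(f"C-index: {c_index:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the C-index is a convenient single number for comparing prognostic signatures across cohorts.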

Our analysis revealed that the novel molecular subtypes exhibited differences in prognoses, biological functions, and immune infiltration profiles in LUAD. The MO-MLPS was successfully established and validated across TCGA-LUAD cohorts, six independent GEO datasets, and their composite meta-cohort. Higher risk scores from the MO-MLPS correlated with poorer prognosis in LUAD, with AUC values exceeding 0.5 at 1, 3, and 5 years across various cohorts. The signature outperformed 49 previously published prognostic signatures. Furthermore, patients classified as high risk exhibited significantly worse overall and progression-free survival than those classified as low risk. Notably, ANLN knockdown and dnANLN expression significantly inhibited cell proliferation and migration in vitro and enhanced the efficacy of docetaxel.
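The high-risk versus low-risk comparison can be illustrated with a small Kaplan-Meier sketch. This assumes a median cutoff on the risk score, which the summary does not state; the helper names `split_by_median` and `kaplan_meier` and all toy values are hypothetical and are not taken from the paper.

```python
from statistics import median


def split_by_median(risk_scores):
    """Label each patient 'high' or 'low' risk using the median score as cutoff
    (an assumed cutoff; the paper's actual threshold is not given here)."""
    cutoff = median(risk_scores)
    return ["high" if r > cutoff else "low" for r in risk_scores]


def kaplan_meier(times, events):
    """Product-limit estimator: return [(t, S(t))] at each distinct event time."""
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy cohort: follow-up in months, 1 = death observed, 0 = censored.
toy_times = [6, 12, 18, 24, 30, 36, 42, 48]
toy_events = [1, 1, 1, 0, 1, 0, 0, 0]
toy_risk = [2.1, 1.8, 1.5, 1.2, 0.9, 0.6, 0.4, 0.2]

groups = split_by_median(toy_risk)
curves = {}
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curves[label] = kaplan_meier([toy_times[i] for i in idx],
                                 [toy_events[i] for i in idx])
    print(label, curves[label])
```

In practice the two curves would be compared with a log-rank test and drawn with a survival library; the pure-Python version here only shows how the estimate is built from the at-risk counts.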

A comprehensive analysis of multi-omics data redefines the molecular subtypes of LUAD. The MO-MLPS derived from subtype characteristics has the potential to serve as a clinically valuable prognostic tool. Furthermore, ANLN emerges as a promising novel therapeutic target in the treatment of LUAD.

## Linked entities

- **Genes:** ANLN (anillin, actin binding protein) [NCBI Gene 54443]
- **Chemicals:** docetaxel (PubChem CID 148124)
- **Diseases:** lung adenocarcinoma (MONDO:0005061)

## Full-text entities

- **Genes:** RHOV (ras homolog family member V) [NCBI Gene 171177] {aka ARHV, CHP, WRCH2}, CD44 (CD44 molecule (IN blood group)) [NCBI Gene 960] {aka CDW44, CSPG8, ECM-III, ECMR-III, H-CAM, HCELL}, VEGFA (vascular endothelial growth factor A) [NCBI Gene 7422] {aka L-VEGF, MVCD1, VEGF, VPF}, KRAS (KRAS proto-oncogene, GTPase) [NCBI Gene 3845] {aka 'C-K-RAS, C-K-RAS, CFC2, K-RAS2A, K-RAS2B, K-RAS4A}, F3 (coagulation factor III, tissue factor) [NCBI Gene 2152] {aka CD142, TF, TFA}, IL1B (interleukin 1 beta) [NCBI Gene 3553] {aka IL-1, IL1-BETA, IL1F2, IL1beta}, CXCR4 (C-X-C motif chemokine receptor 4) [NCBI Gene 7852] {aka CD184, D2S201E, FB22, HM89, HSY3RR, LCR1}, BRCA2 (BRCA2 DNA repair associated) [NCBI Gene 675] {aka BRCC2, BROVCA2, FACD, FAD, FAD1, FANCD}, MIF (macrophage migration inhibitory factor) [NCBI Gene 4282] {aka GIF, GLIF, MMIF}, ANLN (anillin, actin binding protein) [NCBI Gene 54443] {aka FSGS8, Scraps, scra}, C1QB (complement C1q B chain) [NCBI Gene 713] {aka C1QD2}, CD70 (CD70 molecule) [NCBI Gene 970] {aka CD27-L, CD27L, CD27LG, LPFS3, TNFSF7, TNLG8A}, TNFSF4 (TNF superfamily member 4) [NCBI Gene 7292] {aka CD134L, CD252, GP34, OX-40L, OX4OL, TNLG2B}, TNFSF15 (TNF superfamily member 15) [NCBI Gene 9966] {aka TL1, TL1A, TNLG1B, VEGI, VEGI192A}, EXO1 (exonuclease 1) [NCBI Gene 9156] {aka HEX1, hExoI}, CD4 (CD4 molecule) [NCBI Gene 920] {aka CD4mut, IMD79, Leu-3, OKT4D, T4}, ANXA1 (annexin A1) [NCBI Gene 301] {aka ANX1, LPC1}, BTLA (B and T lymphocyte associated) [NCBI Gene 151888] {aka BTLA1, CD272}, ANXA5 (annexin A5) [NCBI Gene 308] {aka ANX5, CPB-I, ENX2, HEL-S-7, PP4, RPRGL3}, CD28 (CD28 molecule) [NCBI Gene 940] {aka IMD123, Tp44}, IDO1 (indoleamine 2,3-dioxygenase 1) [NCBI Gene 3620] {aka IDO, IDO-1, INDO}, PIK3CB (phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit beta) [NCBI Gene 5291] {aka P110BETA, PI3K, PI3KBETA, PIK3C1}, CD40LG (CD40 ligand) [NCBI Gene 959] {aka CD154, CD40L, HIGM1, IGM, IMD3, T-BAM}, SPP1 (secreted 
phosphoprotein 1) [NCBI Gene 6696] {aka BNSP, BSPI, ETA-1, OPN}, TNFSF9 (TNF superfamily member 9) [NCBI Gene 8744] {aka 4-1BB-L, CD137L, TNLG5A}, HMMR (hyaluronan mediated motility receptor) [NCBI Gene 3161] {aka CD168, IHABP, RHAMM}, CD276 (CD276 molecule) [NCBI Gene 80381] {aka 4Ig-B7-H3, B7-H3, B7H3, B7RP-2}, CD27 (CD27 molecule) [NCBI Gene 939] {aka S152, S152. LPFS2, T14, TNFRSF7, Tp55}, JUN (Jun proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 3725] {aka AP-1, AP1, c-Jun, cJUN, p39}, TNFRSF9 (TNF receptor superfamily member 9) [NCBI Gene 3604] {aka 4-1BB, CD137, CDw137, ILA, IMD109}, IFNA1 (interferon alpha 1) [NCBI Gene 3439] {aka IFL, IFN, IFN-ALPHA, IFN-alphaD, IFNA13, IFNA@}, TNFRSF14 (TNF receptor superfamily member 14) [NCBI Gene 8764] {aka ATAR, CD270, HVEA, HVEM, LIGHTR, TR2}, FPR1 (formyl peptide receptor 1) [NCBI Gene 2357] {aka FMLP, FPR}, CCNB1 (cyclin B1) [NCBI Gene 891] {aka CCNB}, CD274 (CD274 molecule) [NCBI Gene 29126] {aka ADMIO5, B7-H, B7H1, PD-L1, PDCD1L1, PDCD1LG1}, GJB3 (gap junction protein beta 3) [NCBI Gene 2707] {aka CX31, DFNA2, DFNA2B, EKV, EKVP1}, ADORA2A (adenosine A2a receptor) [NCBI Gene 135] {aka A2aR, ADORA2, RDC8}, PDCD1 (programmed cell death 1) [NCBI Gene 5133] {aka ADMIO4, AIMTBS, CD279, PD-1, PD1, SLEB2}, LGALS9 (galectin 9) [NCBI Gene 3965] {aka HUAT, LGALS9A}, FOSL1 (FOS like 1, AP-1 transcription factor subunit) [NCBI Gene 8061] {aka FRA, FRA1, fra-1}, LAG3 (lymphocyte activating 3) [NCBI Gene 3902] {aka CD223}, AKT1 (AKT serine/threonine kinase 1) [NCBI Gene 207] {aka AKT, PKB, PKB-ALPHA, PRKBA, RAC, RAC-ALPHA}, MARCO (macrophage receptor with collagenous structure) [NCBI Gene 8685] {aka SCARA2, SR-A6}, PTPRC (protein tyrosine phosphatase receptor type C) [NCBI Gene 5788] {aka B220, CD45, CD45R, GP180, IMD105, L-CA}, TNFRSF18 (TNF receptor superfamily member 18) [NCBI Gene 8784] {aka AITR, CD357, ENERGEN, GITR, GITR-D}, APOE (apolipoprotein E) [NCBI Gene 348] {aka AD2, APO-E, ApoE4, LDLCQ5, LPG}, EGFR 
(epidermal growth factor receptor) [NCBI Gene 1956] {aka ERBB, ERBB1, ERRP, HER1, NISBD2, NNCIS}, CD8A (CD8 subunit alpha) [NCBI Gene 925] {aka CD8, CD8alpha, IMD116, Leu2, p32}, MCEMP1 (mast cell expressed membrane protein 1) [NCBI Gene 199675] {aka C19orf59}, GZMB (granzyme B) [NCBI Gene 3002] {aka C11, CCPI, CGL-1, CGL1, CSP-B, CSPB}, CD163 (CD163 molecule) [NCBI Gene 9332] {aka M130, MM130, SCARI1}, CD48 (CD48 molecule) [NCBI Gene 962] {aka BCM1, BLAST, BLAST1, MEM-102, SLAMF2, hCD48}, TP53 (tumor protein p53) [NCBI Gene 7157] {aka BCC7, BMFS5, LFS1, P53, TRP53}, APC (APC regulator of Wnt signaling pathway) [NCBI Gene 324] {aka BTPS2, DESMD, DP2, DP2.5, DP3, GS}, FABP4 (fatty acid binding protein 4) [NCBI Gene 2167] {aka A-FABP, AFABP, ALBP, HEL-S-104, aP2}, CD74 (CD74 molecule) [NCBI Gene 972] {aka CLIP, DHLAG, HLADG, II, Ia-GAMMA, p33}, IDO2 (indoleamine 2,3-dioxygenase 2) [NCBI Gene 169355] {aka INDOL1}, SCGB3A2 (secretoglobin family 3A member 2) [NCBI Gene 117156] {aka LU103, PNSP1, UGRP1, pnSP-1}, CD200R1 (CD200 receptor 1) [NCBI Gene 131450] {aka CD200R, HCRTR2, MOX2R, OX2R}, FOS (Fos proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 2353] {aka AP-1, C-FOS, p55}
- **Diseases:** hypoxia (MESH:D000860), Tumor Immune Dysfunction and (MESH:D007154), soft tissue (MESH:D017695), tumorigenesis (MESH:D063646), Lung cancer (MESH:D008175), adenocarcinoma (MESH:D000230), Cancer (MESH:D009369), LUAD (MESH:D000077192), inflammation (MESH:D007249), sarcomas (MESH:D012509), cytotoxicity (MESH:D064420), MO-MLPS (MESH:D007859), C1, C2 and C6 tumors (OMIM:217000), hypoxic (MESH:D002534), solid (MESH:D018250), NSCLC (MESH:D002289), tumor metastasis (MESH:D009362)
- **Chemicals:** dnANLN (-), Docetaxel (MESH:D000077143), CQ (MESH:C048021), chloroquine (MESH:D002738), MG132 (MESH:C072553)
- **Species:** Homo sapiens (human, species) [taxon 9606], gut metagenome (species) [taxon 749906]
- **Cell lines:** PC-9 — Homo sapiens (Human), Lung adenocarcinoma, Cancer cell line (CVCL_B260), NCI-H1975 — Homo sapiens (Human), Lung adenocarcinoma, Cancer cell line (CVCL_1511), BEAS-2B — Homo sapiens (Human), Transformed cell line (CVCL_0168), HCC827 — Homo sapiens (Human), Lung adenocarcinoma, Cancer cell line (CVCL_2063)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12320504/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12320504/full.md

## References

64 references — full list in the complete paper: https://tomesphere.com/paper/PMC12320504/full.md

---
Source: https://tomesphere.com/paper/PMC12320504