# CA-CAE: A deep learning-based multi-omics model for pan-cancer subtype classification and prognosis prediction

**Authors:** Shumei Zhang, Yicheng Lu, Peixian Li, Junxuan Wu, Guohua Wang, Wen Yang

PMC · DOI: 10.1371/journal.pcbi.1014015 · PLOS Computational Biology · 2026-02-20

## TL;DR

A new deep learning model called CA-CAE improves cancer subtype classification and survival prediction by integrating multiple types of biological data.

## Contribution

CA-CAE introduces a channel attention mechanism in a convolutional autoencoder to better identify cancer subtypes and prognostic genes using multi-omics data.
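
This summary does not spell out how the channel attention is computed, so the following is only a minimal sketch of one common formulation (squeeze-and-excitation style gating), not the authors' implementation. The function name `channel_attention`, the weight matrices `w1` and `w2`, the reduction ratio, and the toy input are all assumptions made for illustration.

```python
import math
import random


def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style channel attention (illustrative assumption).

    x  : list of C channels, each a list of L feature values
    w1 : (C // r) x C weight matrix of the bottleneck layer
    w2 : C x (C // r) weight matrix of the expansion layer
    Returns the rescaled feature map and the per-channel weights.
    """
    # Squeeze: global average pooling gives one summary value per channel.
    z = [sum(channel) / len(channel) for channel in x]
    # Excitation: bottleneck layer with ReLU, then expansion layer with sigmoid.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h)))) for row in w2]
    # Scale: multiply every value in a channel by that channel's weight.
    y = [[s_c * v for v in channel] for s_c, channel in zip(s, x)]
    return y, s


# Toy input: 4 channels of length 6 with random (untrained) weights.
random.seed(0)
n_channels, length, reduction = 4, 6, 2
x = [[random.gauss(0, 1) for _ in range(length)] for _ in range(n_channels)]
w1 = [[random.gauss(0, 0.5) for _ in range(n_channels)]
      for _ in range(n_channels // reduction)]
w2 = [[random.gauss(0, 0.5) for _ in range(n_channels // reduction)]
      for _ in range(n_channels)]

y, weights = channel_attention(x, w1, w2)
print("per-channel attention weights:", weights)
```

In a trained model the weights would be learned jointly with the autoencoder, so channels carrying more informative signal would receive values closer to 1.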

## Key findings

- CA-CAE identified subtypes in 15 cancer types, with significant survival differences among the subtypes of each cancer.
- The model outperformed traditional and other deep learning methods in predicting survival outcomes.
- The approach highlights the importance of integrating multi-omics data for personalized cancer treatment strategies.
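
The summary does not name the metric behind the survival-prediction comparison. A widely used choice for such comparisons is Harrell's concordance index (C-index), sketched below as an assumption about how performance could be scored; the function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable patient pairs whose
    predicted risks are ordered consistently with observed survival.

    times       : observed follow-up times
    events      : 1 if death was observed, 0 if censored
    risk_scores : model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's last follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort: the third patient is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
good_risk = [0.9, 0.7, 0.5, 0.1]   # risk decreases as survival time increases
bad_risk = [0.1, 0.5, 0.7, 0.9]    # the reverse ordering

print(concordance_index(toy_times, toy_events, good_risk))
print(concordance_index(toy_times, toy_events, bad_risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which makes the index convenient for comparing models on the same cohort.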

## Abstract

Identifying cancer subtypes and evaluating prognosis are crucial for personalized cancer diagnosis and treatment. With the advancement of high-throughput sequencing technologies, multi-omics data have become essential for cancer classification and prognostic analysis. Combining these data with deep learning techniques makes it possible to identify cancer subtypes more accurately, providing a robust basis for personalized treatment of cancer patients. In this study, we propose a convolutional autoencoder prognostic model incorporating a channel attention mechanism (CA-CAE). The model uses multi-omics data to predict survival-associated cancer subtypes and identify prognostic genes. We applied CA-CAE to multiple cancer types, identifying subtypes in 15 distinct cancer types and revealing significant survival differences among these subtypes. Moreover, compared with traditional statistical methods and other deep learning approaches, CA-CAE demonstrated superior performance in predicting survival outcomes.
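
The abstract reports significant survival differences among subtypes without stating the test in this summary view. A standard way to check such a difference between two groups is the log-rank test; the sketch below is an illustrative, stdlib-only version under that assumption, with a hypothetical function name and made-up follow-up times.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value)."""
    # Pool both groups; the last field marks group membership (0 = A, 1 = B).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # observed minus expected events in group A
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk in A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk in B
        n = n_a + n_b
        d = sum(1 for ti, e, _ in data if ti == t and e)         # events at t
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up follow-up times (months) for two hypothetical subtypes, no censoring.
short_survivors = [1, 2, 3, 4, 5]
long_survivors = [10, 11, 12, 13, 14]
all_events = [1, 1, 1, 1, 1]

chi2, p = logrank_test(short_survivors, all_events, long_survivors, all_events)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

With more than two subtypes, the same idea extends to a multi-group log-rank statistic with more degrees of freedom; established survival-analysis libraries provide tested implementations.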

Cancer is a highly complex disease, and predicting how a patient will respond to treatment or how their disease will progress is one of the biggest challenges in modern medicine. In this study, we developed a deep learning model called CA-CAE that combines multiple types of biological data, such as gene expression, DNA methylation, and microRNA levels, to identify cancer subtypes and predict patient outcomes more accurately. Unlike traditional models that rely on a single type of data or treat all features equally, our model uses a channel attention mechanism to focus on the most important biological signals. We tested this approach on data from 15 types of cancer and found that it outperformed existing methods in identifying meaningful patient subgroups and predicting survival. These findings suggest that combining different layers of biological information can provide a more complete understanding of cancer and may help guide more personalized treatment strategies in the future. Our work highlights how artificial intelligence can improve cancer care by making better use of the vast data already available.

## Linked entities

- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Genes:** IGHG4 (immunoglobulin heavy constant gamma 4 (G4m marker)) [NCBI Gene 3503], SFTPA2 (surfactant protein A2) [NCBI Gene 729238] {aka COLEC5, ILD2, PSAP, PSP-A, PSPA, SFTP1}, PIGR (polymeric immunoglobulin receptor) [NCBI Gene 5284], SFTPB (surfactant protein B) [NCBI Gene 6439] {aka PSP-B, SFTB3, SFTP3, SMDP1, SP-B}, S100P (S100 calcium binding protein P) [NCBI Gene 6286] {aka MIG9}, CEACAM5 (CEA cell adhesion molecule 5) [NCBI Gene 1048] {aka CD66e, CEA}, MUC5B (mucin 5B, oligomeric mucus/gel-forming) [NCBI Gene 727897] {aka MG1, MUC-5B, MUC5, MUC9}, CXCL14 (C-X-C motif chemokine ligand 14) [NCBI Gene 9547] {aka BMAC, BRAK, KEC, KS1, MIP-2g, MIP2G}, TFF3 (trefoil factor 3) [NCBI Gene 7033] {aka ITF, P1B, TFI}, CPS1 (carbamoyl-phosphate synthase 1) [NCBI Gene 1373] {aka CPS1D, CPSASE1, GATD6, PHN}, AKT1 (AKT serine/threonine kinase 1) [NCBI Gene 207] {aka AKT, PKB, PKB-ALPHA, PRKBA, RAC, RAC-ALPHA}, HLA-DRA (major histocompatibility complex, class II, DR alpha) [NCBI Gene 3122] {aka HLA-DRA1}, MSLN (mesothelin) [NCBI Gene 10232] {aka MPF, SMRP}, CTSE (cathepsin E) [NCBI Gene 1510] {aka CATE}, CRLF1 (cytokine receptor like factor 1) [NCBI Gene 9244] {aka CISS, CISS1, CLF, CLF-1, NR6, zcytor5}, AKR1C1 (aldo-keto reductase family 1 member C1) [NCBI Gene 1645] {aka 2-ALPHA-HSD, 20-ALPHA-HSD, DD1, DD1/DD2, DDH, DDH1}, SFTPA1 (surfactant protein A1) [NCBI Gene 653509] {aka COLEC4, ILD1, PSP-A, PSPA, SFTP1, SFTPA1B}, GPX2 (glutathione peroxidase 2) [NCBI Gene 2877] {aka GI-GPx, GPRP, GPRP-2, GPx-2, GPx-GI, GSHPX-GI}, PIK3CB (phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit beta) [NCBI Gene 5291] {aka P110BETA, PI3K, PI3KBETA, PIK3C1}, EGFR (epidermal growth factor receptor) [NCBI Gene 1956] {aka ERBB, ERBB1, ERRP, HER1, NISBD2, NNCIS}, POGLUT3 (protein O-glucosyltransferase 3) [NCBI Gene 143888] {aka KDELC2}, NAPSA (napsin A aspartic peptidase) [NCBI Gene 9476] {aka KAP, Kdap, NAP1, NAPA, NR1H2-AS1, SNAPA}, TM4SF1 (transmembrane 4 L six family member 1) [NCBI Gene 4071] {aka H-L6, L6, M3S1, TAAL6}
- **Diseases:** COAD (MESH:D029424), autoimmune diseases (MESH:D001327), LUAD (MESH:D000077192), asthma (MESH:D001249), CPTAC (MESH:D009369), LUAD cancer (MESH:D008175), inflammation (MESH:D007249), GBM (MESH:D005909), WGD (MESH:C531766), ACC (MESH:D004476), lymph node (MESH:D000072717), viral infections (MESH:D014777), deaths (MESH:D003643), NMF (MESH:C538347), brain metastasis (MESH:D009362)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Cell lines:** A549 — Homo sapiens (Human), Lung adenocarcinoma, Cancer cell line (CVCL_0023)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12948314/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12948314/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/PMC12948314/full.md

---
Source: https://tomesphere.com/paper/PMC12948314