# Constructing a Prognostic Model for Subtypes of Colorectal Cancer Based on Machine Learning and Immune Infiltration‐Related Genes

**Authors:** Yue Wen, Jing Liao, Chunyan Lu, Lan Huang, Yanling Ma

PMC · DOI: 10.1111/jcmm.70437 · 2025-02-26

## TL;DR

This paper develops a machine learning-based prognostic model that predicts survival in colorectal cancer (CRC) subtypes from immune infiltration-related genes identified through integrated bioinformatics analysis.

## Contribution

The novel contribution is a machine learning-based prognostic model for CRC subtypes that combines immune infiltration-related genes and gene network features with what the authors call Multi-Expense Learning algorithms.

## Key findings

- The model showed good predictive power in cross-validation, as assessed by AUC-ROC and C-index.
- Patients were stratified into high- and low-risk groups with significant differences in overall survival (p < 0.05).
- Integration of gene network features with Multi-Expense Learning algorithms improved prediction robustness.
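
As an illustration of how a risk score, a high-/low-risk split, and a C-index fit together, here is a minimal sketch in plain Python. It is not the authors' code: the gene names (`GENE_A`, `GENE_B`), the coefficients, the patient values, and the median cutoff are all invented assumptions chosen only to make the example runnable.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Linear risk score per patient: sum of coefficient * expression over the model genes."""
    return [sum(coefficients[g] * row[g] for g in coefficients) for row in expression]


def stratify_by_median(scores):
    """Label patients 'high' if their score exceeds the cohort median, else 'low'."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def concordance_index(times, events, scores):
    """Harrell's C-index.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) strictly before patient j's follow-up time. The pair is
    concordant when patient i also has the higher risk score; ties in score
    count as half.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if events[i] != 1:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from the paper).
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5}
expression = [
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 0.5, "GENE_B": 2.0},
    {"GENE_A": 1.5, "GENE_B": 0.5},
    {"GENE_A": 0.2, "GENE_B": 1.5},
]
times = [5, 30, 8, 25]   # follow-up time, arbitrary units
events = [1, 0, 1, 1]    # 1 = death observed, 0 = censored

scores = risk_scores(expression, coefficients)
groups = stratify_by_median(scores)
c_index = concordance_index(times, events, scores)
print(groups, c_index)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking of survival times by risk score. The median is only one common choice of cutoff, and the summary does not state which cutoff the paper used.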

## Abstract

This study constructed a machine learning-based prognostic model from immune infiltration-related genes for each CRC subtype. We used publicly accessible gene expression data and clinical information on colorectal cancer patients. Integrated bioinformatics analysis was used to identify immune-related genes. Machine learning algorithms, such as LASSO regression and random forest, were used to identify the most important genes that may serve as predictors of patient prognosis. Univariate Cox regression, consensus clustering and machine learning algorithms were used to construct a prognostic risk scoring model. Functional enrichment, immune infiltration, copy number variation and mutational burden analyses were performed and validated at the single-cell level. The resulting machine learning-based model showed good predictive power in cross-validation, as assessed by the area under the receiver operating characteristic curve (AUC-ROC) and the C-index. The model also achieved good calibration and discrimination, stratifying patients into high- and low-risk groups with a statistically significant difference in overall survival (OS) (p < 0.05). By integrating multiple types of gene network features with Multi-Expense Learning algorithms, we propose a robust approach for predicting the survival of patients across CRC molecular subtypes. This model could potentially guide personalised treatment strategies and improve patient outcomes, although validation in other cohorts and clinical settings is still necessary.
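
The abstract reports a significant overall survival difference between risk groups (p < 0.05) but does not name the test. A two-group log-rank test is a common choice for such comparisons, so the sketch below assumes it. The survival times and event indicators are invented toy values, not data from the paper.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value), 1 degree of freedom."""
    times_a, times_b = list(times_a), list(times_b)
    events_a, events_b = list(events_a), list(events_b)

    # Distinct times at which at least one event (not censoring) occurred.
    event_times = sorted(
        {t for t, e in zip(times_a + times_b, events_a + events_b) if e == 1}
    )

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for x in times_a if x >= t)
        at_risk_b = sum(1 for x in times_b if x >= t)
        deaths_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        deaths_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n = at_risk_a + at_risk_b
        d = deaths_a + deaths_b

        # Expected deaths in group A under the null hypothesis of equal hazards.
        observed_minus_expected += deaths_a - d * at_risk_a / n
        if n > 1:
            # Hypergeometric variance of the death count in group A.
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("variance is zero; the test is undefined for these data")

    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical toy cohorts (not from the paper): time in months, 1 = death, 0 = censored.
high_times, high_events = [2, 3, 4, 5, 6], [1, 1, 1, 1, 1]
low_times, low_events = [20, 22, 25, 28, 30], [1, 0, 1, 0, 0]

chi2, p_value = logrank_test(high_times, high_events, low_times, low_events)
print(f"chi2={chi2:.2f}, p={p_value:.4f}")
```

In practice a survival analysis library would normally be used instead of a hand-written test; the plain-Python version is shown only to make the calculation explicit.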

## Linked entities

- **Diseases:** colorectal cancer (MONDO:0005575)

## Full-text entities

- **Diseases:** CRC (MESH:D015179)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11862891/full.md

---
Source: https://tomesphere.com/paper/PMC11862891