# A Machine Learning–Based Scoring System to Identify High Immunoactivity Microsatellite Stability Tumors by Quantifying Similarity to Microsatellite Instability-High Tumors in Colorectal Cancers: Development and Quantitative Study

**Authors:** Hongkai Yan, Li Jiang, Yaqi Li, Fengchong Wang, Shaobo Mo, Weiqi Sheng, Dan Huang, Junjie Peng

PMC · DOI: 10.2196/66960 · JMIR Formative Research · 2025-10-16

## TL;DR

This study develops a machine learning tool to identify microsatellite-stable (MSS) colorectal tumors whose immune features resemble those of microsatellite instability-high (MSI-H) tumors; such MSS tumors may respond better to immunotherapy.

## Contribution

A novel machine learning-based scoring system is introduced to quantify the similarity of MSS tumors to MSI-H tumors using immune cell distributions, clinical features, and gene mutations.

## Key findings

- The model identified MSI-H-like MSS tumors with tumor-infiltrating lymphocyte (TIL) distributions similar to those of genuine MSI-H tumors (mean Cohen κ 0.40, SD 0.05, over 10 random seeds).
- MSI-H-like MSS tumors showed no significant difference in TIL features compared to genuine MSI-H tumors.
- The score could guide personalized immunotherapy for MSS CRCs by identifying potential responders to immune checkpoint inhibitors (ICIs).

## Abstract

Microsatellite-stable (MSS) colorectal cancers (CRCs) have a limited response to immune checkpoint inhibitors (ICIs) compared to microsatellite instability-high (MSI-H) CRCs. Nevertheless, previous studies have shown that some MSS CRCs are sensitive to ICIs, although established criteria for selecting these patients for treatment are still lacking.

This study aimed to examine the tumor-infiltrating lymphocyte (TIL) features of MSS CRCs and to develop a novel computational tool that predicts, from multiple factors, how similar a patient's CRC is to MSI-H status.

We collected data from 188 patients with CRC, including MSI status, immune cell distributions, clinical features, and gene mutations, and analyzed them using statistical methods and Cox regression. An ensemble machine learning–based MSI-H score was developed using stacked extreme gradient boosting classifiers to quantify the similarity of patient data to MSI-H data based on immune cell distributions, clinical features, and gene mutations. The model was robust and could handle missing input data for immune cell distributions and gene mutations.

The scorer achieved a mean Cohen κ of 0.40 (SD 0.05) over 10 random seeds in identifying MSI-H–like MSS samples with TIL distributions similar to those of genuine MSI-H CRCs. No significant difference was observed between the TIL features of MSI-H–like MSS CRCs and MSI-H CRCs. The disparity between MSI-H–like MSS CRCs and the remaining MSS CRCs potentially lies in the T regulatory cell (P=.09) and macrophage (P=.16) populations within the tumor stromal region, although neither difference reached conventional statistical significance.
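To make the reported agreement metric concrete, the following is a minimal sketch of how Cohen κ can be computed and summarized across several runs. It uses the standard definition κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function names, the toy labels, and the choice of the sample (n − 1) standard deviation are assumptions for illustration; they are not taken from the paper and do not reproduce its data or pipeline.

```python
from collections import Counter
from math import nan
from statistics import mean, stdev


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for two equal-length label sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of samples where both labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of marginal label frequencies, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return nan  # kappa is undefined when only one class ever appears
    return (p_o - p_e) / (1 - p_e)


def summarize_over_seeds(runs):
    """Mean and sample SD of kappa over (y_true, y_pred) pairs, one per seed."""
    kappas = [cohen_kappa(t, p) for t, p in runs]
    return mean(kappas), stdev(kappas)


# Toy labels (1 = MSI-H-like, 0 = not); hypothetical values, not study data.
toy_runs = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([1, 1, 0, 0], [1, 1, 0, 0]),
]
mean_kappa, sd_kappa = summarize_over_seeds(toy_runs)
print(f"mean kappa = {mean_kappa:.2f}, SD = {sd_kappa:.2f}")
```

Because κ subtracts chance agreement, it is more conservative than raw accuracy when one class dominates, as is typically the case for MSI-H among CRCs.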

Some patients with MSS CRC presented immune cell distributions with high immunoactivity similar to those of patients with MSI-H CRC. The MSI-H score serves as a metric to quantify the similarity of MSS CRCs to MSI-H CRCs and presents a promising avenue for more personalized and effective cancer immunotherapy, offering a clinical reference for potential ICI targets in MSS CRCs.

## Linked entities

- **Diseases:** colorectal cancer (MONDO:0005575)

## Full-text entities

- **Genes:** RBBP4 (RB binding protein 4, chromatin remodeling factor) [NCBI Gene 5928] {aka NURF55, RBAP48, lin-53}
- **Diseases:** CRC (MESH:D015179), Tumors (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12530644/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12530644/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/PMC12530644/full.md

---
Source: https://tomesphere.com/paper/PMC12530644