# A predictive model to identify optimal candidates for surgery among patients with metastatic colorectal cancer

**Authors:** Xiqiang Zhang, Zhaoyi Jing, Longchao Wu, Ze Tao, Dandan Lu

PMC · DOI: 10.3389/fonc.2025.1573431 · Frontiers in Oncology · 2025-06-05

## TL;DR

This study developed a predictive model to help doctors decide which metastatic colorectal cancer patients would benefit most from surgery.

## Contribution

A predictive model using clinical data to identify optimal candidates for surgery in metastatic colorectal cancer patients.

## Key findings

- After propensity score matching, surgery was associated with significantly longer median cancer-specific survival (22 vs. 12 months).
- Logistic regression slightly outperformed all nine machine learning models, including the best of them, Categorical Boosting.
- The logistic regression model showed good discriminative ability, with an AUC of 0.741 in the test set.

## Abstract

To improve clinical decision-making, we developed a predictive model to identify metastatic colorectal cancer (mCRC) patients who might benefit from primary tumor resection (PTR).

We extracted clinical data of stage IV CRC patients diagnosed between 2010 and 2019 from the Surveillance, Epidemiology, and End Results database. Patients were categorized into surgery and non-surgery groups, and propensity score matching (PSM) was used to balance confounding factors between the two groups. To identify independent predictors of cancer-specific survival (CSS), we used multivariate Cox regression analysis. We then sorted patients who underwent surgery into benefit and non-benefit groups based on the median CSS of the non-surgery group, and split them into training and test sets at a ratio of 6:4. We used the Boruta selection method to further filter variables, and then constructed models that predict, from the key predictive factors, whether a patient benefited from surgery.
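The benefit labelling and the 6:4 split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy survival times, the fixed random seed, and the rule "strictly longer than the non-surgery median counts as benefit" are all assumptions, since the summary does not state how ties or censored observations were handled.

```python
import random
from statistics import median


def label_benefit(surgery_css_months, non_surgery_css_months):
    """Label surgery patients against the non-surgery median CSS.

    Assumption: a patient is labelled 1 (benefit) when their CSS is strictly
    longer than the median CSS of the non-surgery group, otherwise 0.
    """
    threshold = median(non_surgery_css_months)
    labels = [1 if months > threshold else 0 for months in surgery_css_months]
    return labels, threshold


def split_train_test(rows, train_fraction=0.6, seed=0):
    """Shuffle rows and split them into training and test sets (6:4 by default)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy CSS values in months (hypothetical, for illustration only).
non_surgery_css = [6, 10, 12, 14, 20]
surgery_css = [5, 12, 13, 22, 30, 8, 40, 11, 25, 18]

labels, threshold = label_benefit(surgery_css, non_surgery_css)
rows = list(zip(surgery_css, labels))
train_rows, test_rows = split_train_test(rows)
```

Only patients in the surgery group receive a label; the non-surgery group contributes nothing but the threshold.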

We identified 23,649 mCRC patients, of whom 80.97% (19,148) underwent PTR. After PSM, surgical intervention was independently associated with a longer median CSS than no surgical intervention [median: 22 vs. 12 months; HR: 2.323, P < 0.001]. Among the nine machine learning models, the Categorical Boosting model performed best but was still slightly inferior to traditional logistic regression. The traditional logistic regression model showed good discriminative ability in both the training (area under the curve [AUC]: 0.727 [0.699-0.756]) and test (AUC: 0.741 [0.706-0.776]) sets.
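For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen benefit patient receives a higher predicted score than a randomly chosen non-benefit patient, with ties counted as one half. The sketch below computes it directly from that pairwise definition. The function name and the toy labels and scores are hypothetical, and this is not necessarily how the authors computed the reported values or their intervals.

```python
def pairwise_auc(labels, scores):
    """Area under the ROC curve from the pairwise (Mann-Whitney) definition.

    labels: 1 for benefit, 0 for non-benefit.
    scores: predicted probability (or any score) of benefit.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # positive case ranked above negative case
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical values): 8 of the 9 positive/negative pairs
# are ordered correctly, so the AUC is 8/9.
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.2]
example_auc = pairwise_auc(example_labels, example_scores)
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; a rank-based implementation would be preferable for a cohort of this size.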

We developed a predictive model that could identify optimal candidates for PTR among mCRC patients with high accuracy.

## Linked entities

- **Diseases:** colorectal cancer (MONDO:0005575)

## Full-text entities

- **Diseases:** cancer (MESH:D009369), colorectal cancer (MESH:D015179)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12176591/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12176591/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/PMC12176591/full.md

---
Source: https://tomesphere.com/paper/PMC12176591