# Interpretable machine learning model integrating contrast-enhanced CT environmental radiomics and clinicopathological features for predicting postoperative recurrence in lung adenocarcinoma: a retrospective pilot study

**Authors:** Song Lin, Yanli Niu, Lina Song, Yingjian Ye, Jinfang Yang, Junjie Liu, Xin Zhou, Peng An

PMC · DOI: 10.3389/fonc.2025.1601674 · Frontiers in Oncology · 2025-05-23

## TL;DR

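This retrospective study builds an interpretable machine learning model that combines contrast-enhanced CT radiomics with clinicopathological data to predict 3-year recurrence after surgery for lung adenocarcinoma, with the aim of guiding personalized adjuvant therapy.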

## Contribution

The novel contribution is an interpretable machine learning model combining CECT radiomics and clinicopathological features for predicting lung adenocarcinoma recurrence.

## Key findings

- CatBoost achieved the best test-set performance (AUC 0.883, 95% CI: 0.811–0.955), slightly above logistic regression (AUC 0.877).
- Radscore3 (intratumoral) and Radscore4 (peritumoral), both derived from CECT images, were significant predictors of recurrence.
- SHAP analysis identified heterogeneous enhancement, visceral pleural invasion, Radscore3/4, and Ki67 as key contributors to recurrence risk.

## Abstract

This study aims to develop an interpretable predictive model combining contrast-enhanced CT (CECT) radiomics features with clinicopathological parameters to assess 3-year recurrence risk after surgery for lung adenocarcinoma (LA).

A retrospective cohort of 350 LA patients (126 recurrence, 224 non-recurrence) from Xiangyang NO.1 People’s Hospital (2016–2023) was included. Radiomics features were extracted from arterial and venous phase CECT images using 3D Slicer’s Radiomics plugin. Features with an intraclass correlation coefficient (ICC) above 0.75 were retained, followed by LASSO regression with cross-validation to generate radiomics scores (Radscore3 for intratumoral and Radscore4 for peritumoral regions). Clinical variables (sex, heterogeneous enhancement, pleural invasion, Ki67) were selected by chi-square and t-test analysis and integrated with the radiomics scores. Ten machine learning algorithms (e.g., XGBoost, CatBoost, Random Forest) were trained on a stratified 7:3 split (training: n=245; testing: n=105) with five-fold cross-validation. Model performance was evaluated using ROC curves (AUC), calibration curves, and decision curve analysis (DCA), and a nomogram was constructed.
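The stratified 7:3 split can be sketched in plain Python. This is a minimal illustration and not the authors' code: the function name `stratified_split`, the random seed, and the per-class rounding rule are assumptions, and the per-class counts it produces are derived arithmetic rather than figures reported in the paper.

```python
import random


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split sample indices so each class keeps roughly the same train fraction.

    Hypothetical helper for illustration; the paper does not describe its
    splitting code.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, test = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        # Assumed rule: round the per-class training size to the nearest integer.
        n_train = round(len(idxs) * train_frac)
        train.extend(idxs[:n_train])
        test.extend(idxs[n_train:])
    return sorted(train), sorted(test)


# Cohort composition from the abstract: 126 recurrence (1), 224 non-recurrence (0).
labels = [1] * 126 + [0] * 224
train_idx, test_idx = stratified_split(labels)

train_recurrence = sum(labels[i] for i in train_idx)
test_recurrence = sum(labels[i] for i in test_idx)
```

Under this rounding rule the split yields 88 + 157 = 245 training and 38 + 67 = 105 testing patients, which agrees with the reported set sizes. The per-class breakdown is not stated in the abstract and may differ in the actual study.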

Univariate analysis identified sex (OR=1.66, p=0.02), heterogeneous enhancement (OR=4.32, p<0.05), visceral pleural invasion (OR=4.75, p<0.05), Radscore3 (OR=356.17, p<0.05), Radscore4 (OR=1529.16, p<0.05), and Ki67 (OR=1.09, p=0.01) as significant predictors. Among the machine learning models, CatBoost achieved the highest AUC in the test set (AUC=0.883, 95% CI: 0.811–0.955), slightly above logistic regression (AUC=0.877, 95% CI: 0.804–0.949). Calibration curves demonstrated high consistency between predicted and observed recurrence risks, while DCA indicated clinical utility at threshold probabilities >0.17. SHAP analysis highlighted heterogeneous enhancement, visceral pleural invasion, Radscore3/4, and Ki67 as key contributors. The nomogram integrated these factors, enhancing model interpretability and clinical applicability.
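To make the DCA threshold concrete, the sketch below applies the standard net-benefit formula used in decision curve analysis, NB = TP/n − (FP/n) · pt/(1 − pt), at the reported threshold pt = 0.17. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not results from the paper. The prevalence of 0.36 corresponds to the whole-cohort recurrence rate (126/350).

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of a model at a threshold probability (standard DCA formula)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    # Odds of the threshold: how many false positives one true positive is worth.
    weight = threshold / (1 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    weight = threshold / (1 - threshold)
    return prevalence - (1 - prevalence) * weight


# HYPOTHETICAL counts for illustration only (not taken from the paper):
# 100 patients, 36 of whom recur; the model flags 30 true and 12 false positives.
n = 100
tp, fp = 30, 12
threshold = 0.17  # lowest threshold probability at which DCA showed utility

model_nb = net_benefit(tp, fp, n, threshold)
treat_all_nb = net_benefit_treat_all(prevalence=0.36, threshold=threshold)
```

A model shows clinical utility at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy), which is the comparison a decision curve plots across thresholds.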

The CatBoost model integrating CECT environmental radiomics and clinicopathological parameters effectively predicts postoperative LA recurrence, supporting personalized adjuvant therapy decisions. Its interpretable framework emphasizes tumor heterogeneity (Radscore3/4) as a critical prognostic biomarker, providing mechanistic insights into LA recurrence.

## Linked entities

- **Diseases:** lung adenocarcinoma (MONDO:0005061)

## Full-text entities

- **Diseases:** pleural invasion (MESH:D010995), tumor (MESH:D009369), LA (MESH:D000077192)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12141000/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12141000/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/PMC12141000/full.md

---
Source: https://tomesphere.com/paper/PMC12141000