# An interpretable machine learning model combining MRI-DKI habitat radiomic features and clinical biomarkers for noninvasive prediction of lymphatic metastasis in rectal cancer: a prospective study

**Authors:** Leping Peng, Feixiang Li, Fan Zhang, Fang Ma, Xiuling Zhang, Xiaoyue Zhang, Dongdong Chen, Gang Huang, Lili Wang

PMC · DOI: 10.1186/s13244-026-02243-2 · 2026-03-25

## TL;DR

This prospective study developed an interpretable machine learning model that combines habitat radiomic features from MRI diffusion kurtosis imaging (DKI) with clinical immune-inflammatory biomarkers to predict lymphovascular invasion and lymph node metastasis in rectal cancer from preoperative data.

## Contribution

The study introduces a novel combined model using DKI-based habitat radiomic features and clinical immune-inflammatory biomarkers for noninvasive prediction of lymphatic metastasis in rectal cancer.

## Key findings

- Model 3 achieved an AUC of 0.937 for LVI in the training cohort and 0.947 for LNM in the testing cohort.
- The habitat radiomics score is reported as a novel and robust quantitative biomarker for rectal cancer.
- The combined model (Model 3) outperformed the other models in predicting the risk of lymphatic metastasis.

## Abstract

Tumor heterogeneity significantly influences lymphovascular invasion (LVI) and lymph node metastasis (LNM) in rectal cancer (RC), thereby affecting treatment outcomes and prognosis. This study aimed to develop a combined model integrating diffusion kurtosis imaging (DKI)-based habitat radiomic features with clinical immune-inflammatory biomarkers to predict the risk of lymphatic metastasis in RC.

This prospective study included 151 patients with pathologically confirmed rectal adenocarcinoma who underwent preoperative MRI (training cohort: 105 cases; testing cohort: 46 cases). Two radiologists manually delineated the whole-tumor volume of interest (VOI) slice by slice on the mean diffusivity (MD) maps using ITK-SNAP, and the VOIs were then mapped onto the mean kurtosis (MK) maps. K-means clustering was applied to segment each tumor into subregions (habitats). Predictive models for LVI and LNM were built using the Random Forest and Extra Trees algorithms, respectively. The SHapley Additive exPlanations (SHAP) method was used to quantify the contribution of each feature to the decisions of the combined model (Model 3).
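The habitat segmentation step can be illustrated with a minimal sketch. It assumes each voxel inside the VOI is described by its (MD, MK) pair, that both features are z-scored before clustering, and that two clusters are requested; none of these choices is stated in the abstract, and the voxel values are hypothetical. The `standardise` and `kmeans` helpers are written here for illustration (plain Lloyd's algorithm in standard-library Python) and are not the authors' implementation.

```python
import random
from statistics import mean, pstdev


def standardise(voxels):
    """Z-score each feature column so MD and MK contribute on a comparable scale."""
    columns = list(zip(*voxels))
    mus = [mean(col) for col in columns]
    sds = [pstdev(col) or 1.0 for col in columns]  # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(vox, mus, sds)) for vox in voxels]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per input point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from k sampled voxels
    labels = None
    for _ in range(n_iter):
        # Assignment step: each voxel joins its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stable, so the algorithm has converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(mean(col) for col in zip(*members))
    return labels


# Hypothetical (MD, MK) values for six voxels inside one tumor VOI; not study data.
voxels = [(0.80, 1.10), (0.85, 1.05), (0.90, 1.00),
          (1.80, 0.50), (1.90, 0.55), (2.00, 0.45)]
habitats = kmeans(standardise(voxels), k=2)  # k=2 is an assumption
print(habitats)
```

In this toy input the low-MD/high-MK voxels and the high-MD/low-MK voxels end up in separate habitats; which group receives label 0 depends on the random initialisation. Radiomic features would then be extracted per habitat rather than from the whole tumor.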

Logistic regression analysis identified NHR and EMVI as independent predictors of LVI, and BMI, CA19-9, PNI, and EMVI as independent predictors of LNM. Model 3, which integrated clinical immune-inflammatory biomarkers, conventional radiomic features, and habitat radiomic features, demonstrated the best performance. The AUCs for predicting LVI and LNM were 0.937 vs. 0.864 and 0.901 vs. 0.947 in the training and testing cohorts, respectively.
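For readers less familiar with the AUC metric used above, the following sketch computes it directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The `roc_auc` helper and the labels and scores are hypothetical illustrations, not the study's code or data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = metastasis present) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(y_true, y_prob)
print(round(auc, 3))
```

Here eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9. The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of this size.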

The habitat radiomics score is a novel and robust quantitative biomarker. Model 3 performed well in assessing the risk of lymphatic metastasis in RC.

Habitat radiomics features derived from DKI parameter maps, combined with clinical immune-inflammatory biomarkers, can predict the risk of lymphatic metastasis of RC, potentially complementing biopsy-based identification of high-risk regions and advancing risk stratification for clinical decision-making in RC management.

Accurate assessment of lymphatic metastasis risk in rectal cancer is crucial for clinical decision-making and personalized treatment optimization.

Diffusion kurtosis imaging-derived parameters and habitat radiomic features can quantify and characterize intratumoral heterogeneity.

The combined model provides higher predictive performance for LVI and LNM in rectal cancer.

## Linked entities

- **Diseases:** rectal cancer (MONDO:0006519)

## Full-text entities

- **Genes:** CEACAM3 (CEA cell adhesion molecule 3) [NCBI Gene 1084] {aka CD66D, CEA, CGM1, CGM1a, W264, W282}, S100A1 (S100 calcium binding protein A1) [NCBI Gene 6271] {aka S100, S100-alpha, S100A}, INS (insulin) [NCBI Gene 3630] {aka IDDM, IDDM1, IDDM2, ILPR, IRDN, MODY10}, SHROOM4 (shroom family member 4) [NCBI Gene 57477] {aka MRXSSDS, SHAP, shrm4}, PECAM1 (platelet and endothelial cell adhesion molecule 1) [NCBI Gene 5175] {aka CD31, CD31/EndoCAM, GPIIA', PECA1, PECAM-1, endoCAM}, ALB (albumin) [NCBI Gene 213] {aka FDAHT, HSA, PRO0883, PRO0903, PRO1341}, TENM1 (teneurin transmembrane protein 1) [NCBI Gene 10178] {aka ODZ1, ODZ3, TEN-M1, TEN1, TNM, TNM1}
- **Diseases:** inflammation (MESH:D007249), EMVI (MESH:D009361), CRC (MESH:D015179), necrotic (MESH:D009336), RC (MESH:D012004), LNM (MESH:D008207), rectal mucinous adenocarcinoma (MESH:D002288), MD (MESH:D008228), tumorigenesis (MESH:D063646), Obesity (MESH:D009765), Malignant tumors (MESH:D009369), rectal (MESH:D012002), hypoxia (MESH:D000860), liver metastasis (MESH:D009362), -4 (MESH:D053632), trauma (MESH:D014947), rectal adenocarcinoma (MESH:D000230)
- **Chemicals:** glucose (MESH:D005947), TG triglycerides (MESH:D014280), water (MESH:D014867), -density lipoprotein (-), cholesterol (MESH:D002784)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13018487/full.md

---
Source: https://tomesphere.com/paper/PMC13018487