# Interpretable machine learning models based on multi-dimensional fusion data for predicting positive surgical margins in robot-assisted radical prostatectomy: a retrospective study

**Authors:** Zhangcheng Liu, Wenjun Zhou, Pan Dong, Jingyan Liu, Li Luo, Yu Luo, Shuai Su, Santigie Junior Sankoh, Yong Wang, Linhai Liu, Yang Zhang, Shilin Qiu, Lincen Jiang, Kun Han, Jindong Zhang, Jiang He, Delin Wang

PMC · DOI: 10.3389/fonc.2025.1661695 · Frontiers in Oncology · 2025-10-03

## TL;DR

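This study developed interpretable machine learning models using multi-dimensional data to predict positive surgical margins in robot-assisted radical prostatectomy. The best model, a Random Forest, reached AUCs of 0.88 in validation and 0.97 in a temporally separate test set, but still requires prospective and external validation before clinical use.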

## Contribution

The novel contribution is the development of interpretable ML models that fuse demographic, clinical, biopsy pathology, and MRI-derived anatomical features to predict positive surgical margins (PSM) in robot-assisted radical prostatectomy (RARP).

## Key findings

- The Random Forest model achieved high AUCs (0.99 in training, 0.88 in validation, 0.97 in test sets) for predicting positive surgical margins.
- SHAP analysis identified five novel spatial anatomical features negatively associated with PSM risk.
- Five-fold and ten-fold cross-validation of the Random Forest model gave consistent mean AUCs of 0.87 (95% CI: 0.84–0.90) and 0.88 (95% CI: 0.83–0.93), respectively.

## Abstract

This study aimed to develop and validate interpretable machine learning (ML) models based on multi-dimensional fusion data for predicting positive surgical margins (PSM) in robot-assisted radical prostatectomy (RARP).

Patients who underwent RARP at our institution between January 2016 and July 2025 were enrolled. Demographic, clinical, biopsy pathology data, and MRI-derived anatomical features (measured using ITK-SNAP on axial, sagittal, and coronal planes) were collected. Feature selection was performed using intraobserver and interobserver intraclass correlation coefficients (ICCs), low-variance filtering, univariable logistic regression, Spearman’s correlation analysis, the least absolute shrinkage and selection operator (LASSO) algorithm, and the Boruta algorithm. Six ML models were constructed, with performance evaluated using area under the curve (AUC), calibration curves, and decision curve analyses (DCA) to identify the optimal model. Five-fold and ten-fold cross-validation were used to assess the optimal model’s generalizability, and its interpretability was evaluated via Shapley Additive exPlanations (SHAP) analysis.
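As a rough illustration of two of the evaluation metrics named above, the sketch below computes an AUC and a decision-curve net benefit in plain Python. It is not the authors' code: the function names `auc_score` and `net_benefit` are hypothetical, the labels and scores are made-up toy values, and the net benefit uses the standard decision curve formula, NB = TP/n - (FP/n) * pt/(1 - pt), which this summary does not spell out.

```python
from typing import Sequence


def auc_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win
    (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float], threshold: float) -> float:
    """Net benefit at one probability threshold, the quantity plotted in a decision curve.

    Cases with score >= threshold are treated as "predicted PSM".
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * (threshold / (1.0 - threshold))


# Toy, made-up labels (1 = PSM) and predicted probabilities; NOT study data.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

auc = auc_score(labels, scores)
nb = net_benefit(labels, scores, 0.5)
print(f"AUC = {auc:.3f}")
print(f"Net benefit at threshold 0.5 = {nb:.3f}")
```

A full decision curve would repeat `net_benefit` over a grid of thresholds and compare the result against the "treat all" and "treat none" strategies.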

A total of 347 patients were included, comprising a training set (n=193, January 2016–December 2024), validation set (n=84, January 2016–December 2024), and test set (n=70, January 2025–July 2025). From 164 initial features, 7 key features were retained through a four-step screening. The Random Forest (RF) model outperformed other models, achieving AUCs of 0.99 (95% CI: 0.97–1.00) in the training set, 0.88 (95% CI: 0.80–0.95) in the validation set, and 0.97 (95% CI: 0.94–1.00) in the test set. Calibration curve and decision curve analyses confirmed its strong clinical utility. Five-fold cross-validation for the RF model showed fold-specific AUCs of 0.82–0.92, with a mean AUC of 0.87 (95% CI: 0.84–0.90). Ten-fold cross-validation showed fold-specific AUCs of 0.80–0.99, with a mean AUC of 0.88 (95% CI: 0.83–0.93). SHAP analysis revealed that five novel spatial anatomical features (such as Sagittal plane-posterior spatial anatomical structure index and Coronal plane-Left anatomical structure interval) were negatively associated with PSM risk, while the number of positive biopsy cores and clinical tumor stage were positively associated with PSM risk.
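The cohort split described above (training and validation drawn from the same period, test set from a later period) can be sketched as a temporal hold-out. This is a minimal sketch under stated assumptions, not the study's procedure: the records and surgery dates are synthetic, `temporal_split` is a hypothetical helper, and the random 7:3 division of the earlier period is an assumption that merely happens to reproduce the reported 193/84 sizes.

```python
import random
from datetime import date

# Generator for the synthetic dates below; seeded for reproducibility.
rng = random.Random(0)


def random_date(start: date, end: date) -> date:
    """Draw a date uniformly between start and end (inclusive)."""
    span = (end - start).days
    return date.fromordinal(start.toordinal() + rng.randint(0, span))


# Hypothetical cohort of 347 records with made-up surgery dates; NOT study data.
# 277 fall in January 2016 to December 2024, 70 in January 2025 to July 2025.
cohort = [
    {"id": i, "surgery_date": random_date(date(2016, 1, 1), date(2024, 12, 31))}
    for i in range(277)
] + [
    {"id": 277 + i, "surgery_date": random_date(date(2025, 1, 1), date(2025, 7, 31))}
    for i in range(70)
]


def temporal_split(records, cutoff: date, train_fraction: float, seed: int):
    """Hold out everything on or after `cutoff` as the test set, then randomly
    divide the earlier records into training and validation sets."""
    dev = [r for r in records if r["surgery_date"] < cutoff]
    test = [r for r in records if r["surgery_date"] >= cutoff]
    shuffled = dev[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:], test


# Assumed 7:3 ratio for the earlier period (not stated in the summary).
train, val, test = temporal_split(cohort, date(2025, 1, 1), train_fraction=0.7, seed=42)
print(len(train), len(val), len(test))
```

Holding out the most recent patients gives a check on how the model behaves on cases collected after the development period, although it does not replace validation at other institutions.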

Multi-dimensional fusion data combined with ML models improves PSM prediction accuracy in RARP. The RF model, with excellent performance and interpretability, shows promise for preoperative PSM risk stratification, facilitates optimized clinical decision-making, and supports personalized treatment discussions during preoperative planning, but requires prospective and external validation before clinical implementation.

## Linked entities

- **Diseases:** prostate cancer (MONDO:0005159)

## Full-text entities

- **Diseases:** tumor (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12531042/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12531042/full.md

## References

56 references — full list in the complete paper: https://tomesphere.com/paper/PMC12531042/full.md

---
Source: https://tomesphere.com/paper/PMC12531042