# Development and validation of an interpretable machine learning model for acute radiation dermatitis in breast cancer

**Authors:** Xuejuan Duan, Yadong Liu, Yuguang Shang, Xiaomeng Lu, Yanhong Zhou, Liguo Liu, Zhikun Liu

PMC · DOI: 10.3389/fonc.2025.1663293 · 2025-10-17

## TL;DR

This study develops an interpretable machine learning model to predict acute radiation dermatitis in breast cancer patients, aiming to improve treatment planning and reduce severe side effects.

## Contribution

The contribution is an interpretable machine learning model that predicts grade ≥2 acute radiation dermatitis in breast cancer patients receiving postoperative radiotherapy, with key predictors identified via SHAP analysis.

## Key findings

- A random forest model achieved an AUC of 0.84 (95% CI: 0.807–0.873) in training and 0.748 (95% CI: 0.665–0.831) in testing for predicting grade ≥2 acute radiation dermatitis.
- SHAP analysis identified CTVsc, CTVim, TNM stage II, and diabetic status as key predictors of radiation dermatitis.
- Decision curve analysis showed net benefits 0.2–0.4 higher than the 'treat-all' or 'treat-none' strategies at treatment thresholds of 25%–75%.

## Abstract

Radiation dermatitis (RD), a common adverse reaction in breast cancer radiotherapy, impairs quality of life and increases healthcare burdens. Developing an effective risk prediction model is crucial for early high-risk patient identification and preventive interventions.

This study enrolled 691 breast cancer patients undergoing postoperative radiotherapy at our center from February 1 to December 19, 2024. RD severity and correlates were monitored during and 2 weeks after radiotherapy. The dataset was divided into training (n=552) and test (n=139) cohorts. Fourteen machine learning algorithms were evaluated via 10-fold cross-validation, with model selection based on Area Under the Curve (AUC) and other metrics. Model reliability was verified using an internal hold-out test set, and SHAP analysis ensured interpretability.
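The evaluation design described above (a hold-out split followed by 10-fold cross-validation scored by AUC) can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the abstract does not state the tooling, the random seed, or whether the split was stratified, so the helper names `holdout_split`, `kfold_indices`, and `auc` are hypothetical and the shuffling scheme is an assumption. Only the cohort sizes (691, 552, 139) and the fold count come from the text.

```python
import random

def holdout_split(n, n_test, seed=0):
    """Shuffle indices 0..n-1 and reserve n_test of them as a hold-out set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an assumption, not from the paper
    return idx[n_test:], idx[:n_test]

def kfold_indices(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Cohort sizes reported in the abstract: 691 patients, 139 held out for testing.
train_idx, test_idx = holdout_split(691, 139)
folds = list(kfold_indices(train_idx, k=10))
```

In a setup like this, each candidate algorithm would be fitted on every `train` list and scored with `auc` on the matching validation list, and the hold-out indices would be touched only once, for the final check.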

Among 691 patients, 52.68% (n=364) developed grade ≥2 acute RD. The random forest model performed best, achieving an AUC of 0.84 (95% CI: 0.807–0.873) in training and 0.748 (95% CI: 0.665–0.831) in testing, with sensitivity/specificity of 0.811/0.747 in training and 0.877/0.576 in testing. Calibration curves confirmed prediction-observation consistency. Decision curve analysis indicated 0.2–0.4 higher net benefits than “treat-all” or “treat-none” strategies at 25%–75% treatment thresholds. Shapley Additive exPlanations (SHAP) analysis identified Clinical Target Volume-Supraclavicular (CTVsc), Clinical Target Volume-Internal Mammary (CTVim), TNM stage II, and diabetic status as key predictors.
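The quantities reported above can be made concrete with a short sketch. It uses the standard decision-curve definition of net benefit, NB = TP/N − (FP/N) × pt/(1 − pt), where pt is the threshold probability; "treat-none" has a net benefit of 0 by definition. The function names and the four-patient toy data are hypothetical and are not taken from the study.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Count TP, FP, TN, FN when predicting positive for y_prob >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def sensitivity_specificity(y_true, y_prob, threshold):
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return tp / (tp + fn), tn / (tn + fp)

def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on the model's predictions at a given threshold."""
    tp, fp, _, _ = confusion_counts(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Toy data for illustration only (not study data).
toy_labels = [1, 1, 0, 0]
toy_probs = [0.9, 0.6, 0.7, 0.2]
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_probs, 0.5)
toy_nb = net_benefit(toy_labels, toy_probs, 0.5)

# Whole-cohort prevalence from the abstract (364 of 691); applying it to the
# treat-all curve is an assumption, since test-set prevalence is not reported.
cohort_prevalence = 364 / 691
treat_all_curve = {pt: net_benefit_treat_all(cohort_prevalence, pt)
                   for pt in (0.25, 0.50, 0.75)}
```

A decision curve plots `net_benefit` for the model against `net_benefit_treat_all` and the zero line across a range of thresholds; the model is useful at thresholds where its curve lies above both.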

This explainable machine learning model demonstrates robust discriminative power and clinical utility. Interpretability analysis revealed feature nonlinearities, providing a theoretical basis for personalized radiotherapy planning to reduce severe RD risk.
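To illustrate the attribution idea behind SHAP, the sketch below computes exact Shapley values for a tiny hand-written risk function by enumerating all feature subsets. It is not the study's model or the `shap` library: `toy_risk` and its coefficients are invented for illustration, the feature names `ctv_sc`, `ctv_im`, and `diabetic` are hypothetical stand-ins for the predictors named in the abstract, and a single all-zero baseline patient replaces the background dataset that SHAP would normally average over.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    names = list(x)
    n = len(names)

    def value(subset):
        # Features in `subset` take the patient's value, the rest the baseline's.
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi

def toy_risk(f):
    # Invented risk function with one interaction term (a simple nonlinearity).
    return 0.2 + 0.3 * f["ctv_sc"] + 0.1 * f["ctv_im"] + 0.2 * f["ctv_sc"] * f["diabetic"]

patient = {"ctv_sc": 1, "ctv_im": 0, "diabetic": 1}
baseline = {"ctv_sc": 0, "ctv_im": 0, "diabetic": 0}
phi = shapley_values(toy_risk, patient, baseline)
```

The attributions sum to the difference between the patient's prediction and the baseline prediction, and the interaction term is split equally between the two features involved. That additive property is what lets per-feature contributions be read directly off a SHAP plot.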

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Diseases:** breast cancer (MESH:D001943), RD (MESH:D011855), acute RD (MESH:D054508), diabetic (MESH:D003920)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12575144/full.md

---
Source: https://tomesphere.com/paper/PMC12575144