# Machine learning models using multimodal data accurately predict chemotherapy-induced cardiotoxicity in breast cancer

**Authors:** Kundi Chen, Yuqiong An, Zhen Wang, Fang Nie

PMC · DOI: 10.3389/fcvm.2025.1707889 · Frontiers in Cardiovascular Medicine · 2026-01-06

## TL;DR

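This retrospective study of 423 female breast cancer patients trained machine learning models on multimodal data (demographic, clinical, echocardiographic, ECG, and cardiac biomarker data) to predict chemotherapy-related cardiac dysfunction (CTRCD). XGBoost performed best, with an area under the curve of 0.782.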

## Contribution

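The study develops and validates machine learning models that integrate multimodal data to predict the risk of chemotherapy-related cardiac dysfunction (CTRCD) in female breast cancer patients, comparing seven feature selection methods and eight algorithms.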

## Key findings

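- The XGBoost algorithm performed best among the eight algorithms compared, with an area under the curve of 0.782 (95% CI: 0.681–0.883) in 10-fold cross-validation for predicting CTRCD.
- Five predictors of CTRCD were identified: age, baseline left ventricular ejection fraction <60%, anthracycline–trastuzumab combination therapy, number of chemotherapy cycles, and abnormal ECG findings.
- CTRCD occurred in 111 of 423 patients (26.24%); the model may support early risk stratification and timely clinical management of CTRCD in breast cancer patients.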

## Abstract

Despite significant advances in breast cancer therapy, chemotherapy-related cardiac dysfunction (CTRCD) remains a critical clinical challenge. This study aimed to develop and validate machine learning (ML) models that integrate multimodal data to predict the risk of CTRCD in female breast cancer patients.

We retrospectively analyzed data from 423 female breast cancer patients who received chemotherapy between January 2020 and January 2025. Multimodal data included demographic information, clinical variables, echocardiographic parameters, electrocardiographic (ECG) findings, and cardiac biomarkers. The dataset was randomly split into training and validation sets in a 7:3 ratio. Seven feature selection methods and eight ML algorithms were employed to construct and compare predictive models.
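The 7:3 random split described above can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_indices`, the seed, and the rounding rule for the training-set size are all assumptions, and the summary does not state the exact set sizes or whether the split was stratified by outcome.

```python
import random


def split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle indices 0..n-1 and cut them into training and validation lists.

    Hypothetical helper: the seed and the use of round() are illustrative
    choices, not details taken from the paper.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded shuffle for reproducibility
    n_train = round(n * train_fraction)
    return indices[:n_train], indices[n_train:]


# 423 patients, as reported in the abstract.
train_idx, valid_idx = split_indices(423)
print(len(train_idx), len(valid_idx))  # 296 127 under this rounding rule
```

Fixing the seed makes the partition reproducible, which matters when several feature selection methods and algorithms are compared on the same split.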

Among the 423 patients, CTRCD occurred in 111 patients (26.24%). Five variables were identified as robust predictors: age, baseline left ventricular ejection fraction <60%, anthracycline–trastuzumab combination therapy, chemotherapy cycles, and abnormal ECG findings. Among all models evaluated, the extreme gradient boosting (XGBoost) algorithm demonstrated the best performance, achieving an area under the curve of 0.782 (95% CI: 0.681–0.883) in 10-fold cross-validation.
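To make the reported metric concrete, the sketch below computes an area under the ROC curve as the fraction of (positive, negative) pairs in which the positive case receives the higher score, and partitions patient indices into 10 folds. It uses only the Python standard library and toy scores. The helpers `roc_auc` and `k_fold_indices` are hypothetical names, and the summary does not say which patients the authors' 10-fold cross-validation was run on, so the fold sizes here are illustrative only.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=10):
    """Partition indices 0..n-1 into k folds whose sizes differ by at most one.

    No shuffling is done here; a real pipeline would usually shuffle first.
    """
    indices = list(range(n))
    return [indices[i::k] for i in range(k)]


# Toy example: 1 = CTRCD, 0 = no CTRCD; scores are made-up risk estimates.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)  # 0.75: three of the four positive/negative pairs are ordered correctly

folds = k_fold_indices(423)
print(sorted({len(f) for f in folds}))  # [42, 43]
```

Read this way, an AUC of 0.782 means that a randomly chosen patient who developed CTRCD received a higher predicted risk than a randomly chosen patient who did not in roughly 78% of such pairs.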

The XGBoost-based model showed strong predictive ability and may serve as a practical tool for early risk stratification and timely clinical management of CTRCD.

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Diseases:** cardiotoxicity (MESH:D066126), cardiac dysfunction (MESH:D006331), breast cancer (MESH:D001943), CTRCD (MESH:D016609)
- **Chemicals:** anthracycline (MESH:D018943), trastuzumab (MESH:D000068878)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12816329/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12816329/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/PMC12816329/full.md

---
Source: https://tomesphere.com/paper/PMC12816329