# External validation of the Oncotype DX breast cancer recurrence score nomogram and development and validation of a novel machine learning-based model to predict postoperative overall survival and guide adjuvant chemotherapy in ER positive, Her-2 negative breast cancer patients: a retrospective cohort study

**Authors:** Dongdong Wang, Xinfeng Wang, Xin Yang

PMC · DOI: 10.3389/fonc.2025.1586262 · Frontiers in Oncology · 2025-05-21

## TL;DR

This retrospective cohort study reports that a new machine learning model, AORSFM, outperforms the existing Oncotype DX nomogram in predicting postoperative overall survival and guiding adjuvant chemotherapy for ER-positive, HER2-negative breast cancer patients.

## Contribution

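A novel machine learning model, the accelerated oblique random survival forest model (AORSFM), was developed and validated for predicting postoperative overall survival and guiding adjuvant chemotherapy in ER-positive, HER2-negative breast cancer patients.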

## Key findings

- The Oncotype DX nomogram performed poorly in predicting adjuvant chemotherapy benefit in both SEER and BJH cohorts.
- The AORSFM model achieved a C-index of 0.799 in the SEER cohort and 0.793 in the BJH cohort, showing strong predictive performance.
- A web tool was developed for AORSFM, and a new staging system based on it can guide postoperative adjuvant chemotherapy decisions.

## Abstract

This study aimed to externally validate the Oncotype DX (ODX) breast cancer (BC) recurrence score nomogram for predicting the benefit of adjuvant chemotherapy (ACT) after BC surgery, and then to develop a machine learning-based model with superior overall performance for predicting postoperative overall survival (OS) and guiding ACT.

This analysis used data from the SEER database spanning 2010-2020, alongside a BC cohort from Beijing Hospital (BJH). Machine learning methods were applied both for predictor selection (via wrapper methods) and for development of the predictive model. The optimal model was determined using the concordance index (C-index), time-dependent calibration curves, time-dependent receiver operating characteristic (ROC) curves, and decision curve analysis (DCA). The benefit analysis of ACT was primarily conducted using Kaplan-Meier survival analysis.
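As a rough illustration of the Kaplan-Meier analysis named above, the product-limit estimator can be sketched in a few lines of plain Python. This is a minimal sketch, not the authors' code: the function name `kaplan_meier` and the toy follow-up data are hypothetical, and the study's actual software is not stated in this summary.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) pairs.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)      # everyone leaving the risk set at t
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= exits[t]
    return curve

# Hypothetical toy cohort (months of follow-up), for illustration only.
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

In a benefit analysis, one such curve would typically be estimated per group (for example, ACT versus no ACT within a risk stratum) and the curves compared, commonly with a log-rank test.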

The ODX nomogram performed poorly in predicting ACT benefit in both the SEER cohort and the BJH cohort. Subsequently, we employed ten machine learning methods to develop ten prognostic models. The accelerated oblique random survival forest model (AORSFM), which exhibited the highest predictive performance, was selected. The C-index for AORSFM was 0.799 (95% CI 0.779-0.823) in the SEER cohort and 0.793 (95% CI 0.687-0.934) in the BJH cohort. Furthermore, time-dependent calibration curves, time-dependent ROC analysis, and DCA indicated that the AORSFM demonstrated good calibration, predictive accuracy, and clinical net benefit. A publicly accessible web tool was developed for the AORSFM. Notably, the new staging system based on AORSFM can provide guidance for postoperative ACT in such patients.
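To make the C-index figures above concrete, here is a minimal sketch of Harrell's concordance index for right-censored data. It assumes higher risk scores should correspond to shorter survival; the function name `harrell_c_index` and the toy data are hypothetical, pairs with tied times are simply skipped, and the paper may have used a different estimator or implementation.

```python
def harrell_c_index(times, events, risk):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i had an observed event
    and a strictly shorter follow-up time than patient j.
    Ties in risk score count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy data, for illustration only.
times = [5, 8, 12, 15, 20]
events = [1, 1, 0, 1, 1]
risk = [0.9, 0.7, 0.8, 0.3, 0.1]
c = harrell_c_index(times, events, risk)
print(f"C-index = {c:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.799 and 0.793 should be read.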

The AORSFM has the potential to predict postoperative OS and guide ACT in patients with BC. This can assist clinicians in assessing disease severity, planning patient follow-up, and formulating adjuvant treatment strategies.

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Genes:** ESR1 (estrogen receptor 1) [NCBI Gene 2099] {aka ER, ESR, ESRA, NR3A1}, ERBB2 (erb-b2 receptor tyrosine kinase 2) [NCBI Gene 2064] {aka CD340, HER-2, HER-2/neu, HER2, MLN 19, MLN-19}
- **Diseases:** BC (MESH:D001943)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12133539/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12133539/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/PMC12133539/full.md

---
Source: https://tomesphere.com/paper/PMC12133539