# Explainable machine learning model of coronary artery disease combined with diabetes: development and validation study

**Authors:** Xujie Wang, Shipeng Wang, Xuhui Liu, Rongfei Xie, Shasha Liang, Biyun Wang, Yuke Zhang

PMC · DOI: 10.3389/fcvm.2025.1674287 · Frontiers in Cardiovascular Medicine · 2026-01-05

## TL;DR

This study develops a machine learning model to predict mortality in patients with both coronary artery disease and diabetes, using clinical data to improve risk assessment and decision-making.

## Contribution

A novel, validated machine learning model with an explainable nomogram for mortality prediction in patients with comorbid CAD and diabetes.

## Key findings

- The model achieved strong discrimination, with 3-year AUC values of 0.846 in the training cohort and 0.824 in the validation cohort.
- LASSO and backward stepwise Cox regression yielded eight independent predictors; random survival forest analysis prioritized five of them: hemoglobin, INR, albumin, NT-proBNP, and age.
- The nomogram provides individualized risk stratification, and Kaplan–Meier analysis showed significant survival stratification by individual predictors.

## Abstract

Coronary artery disease (CAD) demonstrates a strong bidirectional association with diabetes mellitus, which not only elevates cardiovascular disease risk but also correlates with poorer clinical prognosis. Prognostication in patients with comorbid CAD and diabetes remains a critical clinical challenge, significantly influencing therapeutic decision-making. Leveraging readily available clinical parameters for predicting adverse outcomes in this population offers substantial clinical value. This investigation employs machine learning algorithms to develop predictive models for prognostic assessment in CAD patients with diabetes comorbidity.

We conducted a retrospective cohort study of 389 patients with comorbid coronary artery disease (CAD) and diabetes mellitus. The cohort was randomly allocated into a training set (n = 273) and an internal validation set (n = 116). Feature selection utilized LASSO regression followed by backward stepwise Cox regression analysis. A nomogram incorporating independent predictors was developed for clinical application. Model performance was assessed through discrimination metrics, calibration plots, and decision curve analysis (DCA). Random survival forest analysis validated the clinical significance of selected variables.
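The random allocation described above (389 patients into 273 training and 116 validation, roughly a 70/30 split) can be illustrated with a minimal sketch. The function name `split_cohort`, the integer patient IDs, and the seed are hypothetical; the summary does not state which software or seed the authors used.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly partition patient IDs into training and validation lists.

    Hypothetical helper for illustration; not the authors' code.
    """
    ids = list(patient_ids)
    if not 0 < n_train < len(ids):
        raise ValueError("n_train must be between 1 and len(patient_ids) - 1")
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


# Cohort sizes taken from the abstract; the IDs 0..388 are placeholders.
train_ids, valid_ids = split_cohort(range(389), n_train=273, seed=42)
print(len(train_ids), len(valid_ids))
```

Fixing the training size explicitly, rather than passing a fraction, avoids rounding ambiguity: 70% of 389 is 272.3, which would not reproduce the reported 273.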

Our modeling approach employed a systematic methodology: LASSO regression for initial feature selection followed by backward stepwise Cox regression analysis, yielding eight independent predictors. The final model incorporated hemoglobin, INR, albumin, NT-proBNP, age, fibrinogen, diuretic use, and digitalis therapy. The integrated model demonstrated strong discriminative performance for mortality prediction across both training (AUC = 0.846, 0.838, 0.82) and validation cohorts (AUC = 0.824, 0.813, 0.798) at 3-, 5-, and 8-year intervals. Calibration plots and decision curve analysis confirmed model reliability and clinical utility over time. A nomogram was developed to facilitate individualized risk stratification. Kaplan–Meier analysis showed significant survival stratification by individual predictors, and restricted cubic spline analysis identified non-linear associations between continuous variables and mortality. Random survival forest analysis prioritized five key predictors (hemoglobin, INR, albumin, NT-proBNP, age). Comparative evaluation against the 9-variable model confirmed superior performance of the comprehensive model across all timepoints.
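To make the AUC figures at fixed time horizons concrete, here is a simplified sketch of a time-dependent AUC. It is a naive estimator, not necessarily what the authors computed: patients who died by the horizon count as cases, patients still under follow-up after the horizon count as controls, and patients censored before the horizon are dropped (no inverse-probability-of-censoring weighting). The function name and the toy data are hypothetical.

```python
def naive_time_dependent_auc(times, events, risks, horizon):
    """Naive cumulative/dynamic AUC at a single time horizon.

    times   : follow-up time per patient
    events  : 1 if death was observed, 0 if censored
    risks   : model risk score (higher means higher predicted risk)
    horizon : evaluation time, in the same units as `times`

    Simplification: patients censored at or before the horizon are
    ignored rather than reweighted.
    """
    records = list(zip(times, events, risks))
    cases = [r for t, e, r in records if e == 1 and t <= horizon]
    controls = [r for t, e, r in records if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at this horizon")

    concordant = 0.0
    for risk_case in cases:
        for risk_control in controls:
            if risk_case > risk_control:
                concordant += 1.0
            elif risk_case == risk_control:
                concordant += 0.5  # ties count as half
    return concordant / (len(cases) * len(controls))


# Toy data (made up for illustration only; times in years).
toy_times = [1, 2, 4, 6, 7]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [0.9, 0.4, 0.5, 0.3, 0.1]
auc_at_3 = naive_time_dependent_auc(toy_times, toy_events, toy_risks, horizon=3)
print(round(auc_at_3, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of cases above controls at the chosen horizon.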

Our multimodal prognostic model demonstrated robust performance in predicting all-cause mortality among patients with coronary artery disease and diabetes comorbidity. The nomogram's capacity for personalized risk estimation offers potential utility in clinical decision-making and patient stratification.
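For orientation, a nomogram built on a Cox model typically maps predictor values to a survival probability through the standard proportional hazards relationship sketched below. The symbols are generic notation and are assumptions here: $x$ is the vector of the patient's predictor values, $\beta$ the fitted coefficients, and $h_0(t)$ and $S_0(t)$ the baseline hazard and baseline survival. The paper's fitted values are not given in this summary.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
S(t \mid x) = S_0(t)^{\exp\left(\beta^{\top} x\right)}
```

Under this form, the nomogram's total points are a rescaled version of the linear predictor $\beta^{\top} x$, and the 3-, 5-, and 8-year survival estimates follow from evaluating $S(t \mid x)$ at those times.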

## Linked entities

- **Diseases:** coronary artery disease (MONDO:0005010), diabetes mellitus (MONDO:0005015)

## Full-text entities

- **Genes:** ALB (albumin) [NCBI Gene 213] {aka FDAHT, HSA, PRO0883, PRO0903, PRO1341}, FGB (fibrinogen beta chain) [NCBI Gene 2244] {aka HEL-S-78p}
- **Diseases:** cardiovascular disease (MESH:D002318), diabetes (MESH:D003920), CAD (MESH:D003324)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12812605/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12812605/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/PMC12812605/full.md

---
Source: https://tomesphere.com/paper/PMC12812605