# Developing an explainable machine learning model using body composition to predict cardiovascular mortality in initial dialysis patients: a multicenter study

**Authors:** Xiao-xu Wang, Jin-xuan Wei, Tian-ke Yu, Guo-hao Zheng, Jing-yuan Cao, Min Li, Yao Wang, Shi-mei Hou, Jian Xu, Xiang-dong Yang, Bin Wang

PMC · DOI: 10.3389/fphys.2026.1769240 · Frontiers in Physiology · 2026-02-18

## TL;DR

This multicenter study developed and externally validated an explainable machine learning model that uses CT-derived body composition and clinical data to predict cardiovascular death in patients starting dialysis.

## Contribution

The model integrates CT-based body composition and clinical features for explainable cardiovascular mortality prediction in dialysis patients.

## Key findings

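- CatBoost performed best among the eight algorithms tested, with an AUC of 0.843 in internal validation and 0.799 in external validation, along with good calibration and clinical net benefit.
- The eight selected predictors were age, diabetes, CVD, history of cardiac intervention, dialysis modality, skeletal muscle density, hemoglobin, and serum creatinine.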
- SHAP analysis revealed CVD, skeletal muscle density, and hemoglobin as major contributors to predictions.
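
To illustrate what a SHAP-style attribution measures, the following minimal sketch computes exact Shapley values for a toy risk function by averaging each feature's marginal contribution over all feature orderings. The risk function, its feature names (`cvd`, `low_smd`, `low_hb`), and all numbers are hypothetical and are not taken from the paper; the study's CatBoost model and SHAP tooling are not reproduced here.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    n_orders = 0
    for order in permutations(features):
        present = set()
        prev = value_fn(frozenset(present))
        for f in order:
            present.add(f)
            cur = value_fn(frozenset(present))
            totals[f] += cur - prev  # marginal contribution of f in this ordering
            prev = cur
        n_orders += 1
    return {f: t / n_orders for f, t in totals.items()}


def toy_risk(present):
    """Hypothetical risk score; the numbers are illustrative only."""
    score = 0.10  # baseline risk with no risk factors present
    if "cvd" in present:
        score += 0.20
    if "low_smd" in present:
        score += 0.10
    if "low_hb" in present:
        score += 0.05
    if "cvd" in present and "low_smd" in present:
        score += 0.04  # interaction term, shared equally by the two features
    return score


features = ["cvd", "low_smd", "low_hb"]
phi = shapley_values(toy_risk, features)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the full-feature risk and the baseline, which is the additivity property that makes Shapley-based explanations easy to read feature by feature.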

## Abstract

Cardiovascular disease (CVD) is the leading cause of death in patients receiving dialysis, and accurate risk prediction at dialysis initiation remains limited. We developed and validated a machine learning model integrating CT-derived body composition features to predict CVD-related mortality in initial dialysis patients.

Patients initiating dialysis between 2014 and 2020 at three tertiary hospitals were used for model training and internal validation, and patients from a fourth center were used for external validation. Clinical characteristics and laboratory variables were collected, and body composition parameters were assessed using opportunistic CT scans. Feature selection was performed using univariable logistic regression and LASSO regression. Eight machine learning algorithms were trained, and model performance was assessed using discrimination, calibration, and decision curve analysis. Model interpretability was evaluated using Shapley Additive Explanations (SHAP), and a web-based risk calculator was developed.
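
As a rough illustration of the decision curve analysis mentioned above, the sketch below computes net benefit at a single threshold probability using the standard formula, true positives per patient minus false positives per patient weighted by the odds of the threshold. The labels, predicted risks, and function names are made up for the example and are not study data or the authors' code.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = event) and predicted risks, for illustration only.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.05, 0.5, 0.35]

threshold = 0.3
nb_model = net_benefit(y_true, y_prob, threshold)
nb_all = net_benefit_treat_all(y_true, threshold)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}, treat none: 0.000")
```

A decision curve repeats this calculation across a range of thresholds; a model shows clinical net benefit where its curve lies above both the treat-all and treat-none references.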

Among 1051 incident dialysis patients, 645 were assigned to the training and internal validation cohorts and 406 to the external validation cohort. Eight key predictors were identified, including age, diabetes, CVD, history of cardiac intervention, dialysis modality, skeletal muscle density, hemoglobin, and serum creatinine. CatBoost demonstrated the best performance, with an area under the receiver operating characteristic curve of 0.843 in internal validation and 0.799 in external validation, along with good calibration and clinical net benefit. SHAP analysis identified CVD, skeletal muscle density, and hemoglobin as major contributors.
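
For readers less familiar with the discrimination metric reported here, this short sketch shows one way to compute the area under the ROC curve: as the fraction of (event, non-event) pairs in which the patient with the event received the higher predicted risk, with ties counted as half. The data and the helper name `pairwise_auc` are hypothetical and unrelated to the study cohorts.

```python
def pairwise_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = event) and predicted risks, for illustration only.
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
scores = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.05, 0.5, 0.35]

auc_value = pairwise_auc(labels, scores)
print(f"AUC = {auc_value:.3f}")
```

Under this reading, an AUC of 0.799 in external validation means the model ranked the patient who died of cardiovascular causes above the one who did not in roughly 80% of such pairs.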

An explainable machine learning model incorporating CT-derived body composition features accurately predicts CVD-related mortality in initial dialysis patients. This model may facilitate early risk stratification and targeted prevention strategies at dialysis initiation.

## Linked entities

- **Diseases:** cardiovascular disease (MONDO:0004995), diabetes (MONDO:0005015)

## Full-text entities

- **Genes:** ALB (albumin) [NCBI Gene 213] {aka FDAHT, HSA, PRO0883, PRO0903, PRO1341}, SHROOM4 (shroom family member 4) [NCBI Gene 57477] {aka MRXSSDS, SHAP, shrm4}, EPO (erythropoietin) [NCBI Gene 2056] {aka DBAL, ECYT5, EP, MVCD2}, CST3 (cystatin C) [NCBI Gene 1471] {aka ADLDWA, ARMD11, HEL-S-2}
- **Diseases:** Atherosclerosis (MESH:D050197), cardiac death (MESH:D003643), hypertension (MESH:D006973), cardio-renal anemia syndrome (MESH:D059347), Anemia (MESH:D000740), insulin resistance (MESH:D007333), ESRD (MESH:D007676), infection (MESH:D007239), CVD death (MESH:D002318), uremic toxin (MESH:D006463), cardiac disease (MESH:D006331), hepatic failure (MESH:D017093), CAC (MESH:D003324), inflammatory bowel disease (MESH:D015212), Congestive heart failure (MESH:D006333), peripheral artery disease (MESH:D058729), type 2 diabetes (MESH:D003924), ventricular tachycardia (MESH:D017180), ventricular fibrillation (MESH:D014693), mitochondrial dysfunction (MESH:D028361), coronary heart disease (MESH:D003327), muscle (MESH:D019042), acute coronary syndrome (MESH:D054058), hyperlipidemia (MESH:D006949), Sarcopenia (MESH:D055948), inflammation (MESH:D007249), multi-system dysfunction (MESH:D015161), short bowel syndrome (MESH:D012778), hemorrhagic stroke (MESH:D000083302), edema (MESH:D004487), CKD (MESH:D051436), cardiac arrest (MESH:D006323), malignant tumors (MESH:D009369), diabetes (MESH:D003920), ischemic (MESH:D002545), dyspnea (MESH:D004417), Sudden cardiac death (MESH:D016757), chronic diarrhea (MESH:D003967), Stroke (MESH:D020521), obese (MESH:D009765), arrhythmias (MESH:D001145), SMD (MESH:D005207), decreased muscle mass (MESH:C536030)
- **Chemicals:** ACEI (-), urea nitrogen (MESH:C530477), glucose (MESH:D005947), creatinine (MESH:D003404), alcohol (MESH:D000438), TG (MESH:D014280), UA (MESH:D014527), iron (MESH:D007501), cholesterol (MESH:D002784)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12956525/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12956525/full.md

## References

56 references — full list in the complete paper: https://tomesphere.com/paper/PMC12956525/full.md

---
Source: https://tomesphere.com/paper/PMC12956525