# Development of an interpretable machine learning model for predicting sarcopenia in patients undergoing maintenance hemodialysis

**Authors:** Shuqin Liu, Xingyu Zhu, Zhixin Wang, Wenwu Tang, Ying Zhang, Huaming Xian, Mi Li, Xisheng Xie

PMC · DOI: 10.3389/fmed.2025.1576081 · Frontiers in Medicine · 2025-11-04

## TL;DR

This study builds an interpretable machine learning model that predicts sarcopenia in maintenance hemodialysis patients from routine clinical and laboratory data, with the aim of supporting early detection and personalized treatment.

## Contribution

The study develops a Logistic Regression model that predicts sarcopenia in maintenance hemodialysis (MHD) patients from routine clinical and laboratory data, and uses SHAP to explain the model's predictions.

## Key findings

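- The Logistic Regression model performed best of the five models compared, with a validation-set AUC of 0.828 (95% CI: 0.626–0.989).
- Key predictors were body mass index (BMI), age, gender, creatinine (Cr), 25-hydroxyvitamin D3, left ventricular ejection fraction (LVEF), and estimated glomerular filtration rate (eGFR).
- SHAP analysis identified high BMI and high 25-hydroxyvitamin D3 as protective factors, while low Cr, low LVEF, low eGFR, and female gender increased sarcopenia risk.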
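
The abstract reports the validation AUC with a wide interval (95% CI: 0.626–0.989). The sketch below shows one common way such an interval can arise: a rank-based AUC with a percentile bootstrap. It is an illustration only. This summary does not say how the authors computed their interval, and the toy labels and scores (`toy_labels`, `toy_scores`) are synthetic, not study data.

```python
import random

def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, not the paper's)."""
    auc(labels, scores)  # raises early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi

# Synthetic held-out set: 15 positives, 35 negatives (sizes are hypothetical).
_rng = random.Random(42)
toy_labels = [1] * 15 + [0] * 35
toy_scores = [_rng.gauss(0.6 if y == 1 else 0.4, 0.15) for y in toy_labels]

if __name__ == "__main__":
    low, high = bootstrap_auc_ci(toy_labels, toy_scores)
    print(f"AUC = {auc(toy_labels, toy_scores):.3f}, 95% CI: {low:.3f} to {high:.3f}")
```

With few positive cases in a held-out set, each resample changes the positive/negative pairs substantially, which is why intervals from small validation sets tend to be wide.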

## Abstract

Sarcopenia has a high incidence among patients undergoing maintenance hemodialysis (MHD) and significantly increases the risk of falls, fractures, and mortality. Traditional diagnostic methods, however, are costly and complex, which limits their widespread clinical use. An efficient, interpretable sarcopenia prediction model built from routine clinical and laboratory data is therefore needed, with explainability techniques applied to make its predictions transparent.

This study included 256 MHD patients and developed five machine learning models based on clinical and laboratory data: Logistic Regression, Extreme Gradient Boosting, Random Forest, Support Vector Machine, and Gaussian Naive Bayes. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), calibration curve, and decision curve analysis. Additionally, SHapley Additive exPlanations (SHAP) were employed as an explainability tool to enhance and visualize the interpretability of the optimal model.
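
Of the three evaluation tools named above, decision curve analysis is the least familiar to many readers. A minimal sketch of its core quantity, net benefit at a risk threshold, is given here using the standard definition (true positives per patient minus false positives per patient weighted by the threshold odds). The function names and the ten-patient example are hypothetical and are not taken from the study.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are penalised by the odds of the chosen threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical predictions for ten patients (1 = sarcopenia, 0 = no sarcopenia).
example_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
example_probs = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

if __name__ == "__main__":
    for t in (0.1, 0.2, 0.3, 0.4, 0.5):
        model = net_benefit(example_labels, example_probs, t)
        treat_all = net_benefit_treat_all(example_labels, t)
        # The "treat none" strategy always has net benefit 0.
        print(f"threshold={t:.1f}  model={model:+.3f}  treat_all={treat_all:+.3f}")
```

A decision curve plots these values across thresholds; a model is useful at a given threshold when its net benefit exceeds both the treat-all and the treat-none (zero) strategies.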

The Logistic Regression model demonstrated the best performance on the validation set (AUC = 0.828, 95% CI: 0.626–0.989). Key predictive factors included body mass index (BMI), age, gender, creatinine (Cr), 25-hydroxyvitamin D3, left ventricular ejection fraction (LVEF), and estimated glomerular filtration rate (eGFR). SHAP analysis revealed that high BMI and 25-hydroxyvitamin D3 levels were protective factors, while low Cr, LVEF, and eGFR levels, as well as female gender, significantly increased the risk of sarcopenia.
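
For a linear model such as Logistic Regression, SHAP values have a simple closed form when features are treated as independent: on the log-odds scale, each feature's contribution is its coefficient times the feature's deviation from the background mean. The sketch below illustrates that form. All coefficients, the intercept, and the standardized feature values are invented for illustration; only their signs follow the directions reported above, and the authors' actual SHAP pipeline is not described in this summary.

```python
def log_odds(weights, intercept, x):
    """Linear predictor of a logistic regression (log-odds of sarcopenia)."""
    return intercept + sum(weights[f] * x[f] for f in weights)

def linear_shap(weights, background, x):
    """SHAP values for a linear model, assuming independent features."""
    n = len(background)
    means = {f: sum(row[f] for row in background) / n for f in weights}
    return {f: weights[f] * (x[f] - means[f]) for f in weights}

# HYPOTHETICAL coefficients on standardized features (not the paper's fit).
WEIGHTS = {"bmi": -0.8, "vit_d": -0.5, "creatinine": -0.6, "female": 0.7}
INTERCEPT = -1.0

# HYPOTHETICAL background sample used to define the "average" patient.
BACKGROUND = [
    {"bmi": -1.0, "vit_d": 0.5, "creatinine": -0.5, "female": 1.0},
    {"bmi": 0.0, "vit_d": -1.0, "creatinine": 1.0, "female": 0.0},
    {"bmi": 1.0, "vit_d": 0.5, "creatinine": -0.5, "female": 0.0},
    {"bmi": 0.0, "vit_d": 0.0, "creatinine": 0.0, "female": 1.0},
]

# HYPOTHETICAL patient: high BMI and vitamin D, low creatinine, female.
patient = {"bmi": 1.5, "vit_d": 1.0, "creatinine": -1.0, "female": 1.0}
phi = linear_shap(WEIGHTS, BACKGROUND, patient)

if __name__ == "__main__":
    for feature, value in sorted(phi.items(), key=lambda kv: kv[1]):
        direction = "lowers" if value < 0 else "raises"
        print(f"{feature:>10}: {value:+.2f} ({direction} predicted risk)")
```

The contributions sum to the difference between this patient's log-odds and the average log-odds over the background sample, which is the additivity property that makes SHAP plots readable as a decomposition of a single prediction.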

This study developed a Logistic Regression model using an interpretable machine learning approach, offering an efficient tool for early screening of sarcopenia risk in MHD patients and facilitating personalized intervention strategies. However, the single-center design limits the model’s external applicability, and further multi-center studies are necessary to validate its generalizability.

## Linked entities

- **Chemicals:** 25-hydroxyvitamin D3 (PubChem CID 5283731)

## Full-text entities

- **Diseases:** fractures (MESH:D050723), falls (MESH:C537863), Sarcopenia (MESH:D055948)
- **Chemicals:** Cr (MESH:D003404), 25-hydroxyvitamin D3 (MESH:D002112)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12623174/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12623174/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/PMC12623174/full.md

---
Source: https://tomesphere.com/paper/PMC12623174