# Development and validation of a machine learning-based risk prediction model for sarcopenia in community hospital patients: a retrospective cohort study

**Authors:** Xue Zhao, Wang Yao, Jiawei Shen, Xinyu Tang, Jue Zheng, Chang Guo, Sun Ye, Miqiong Li, Chao Wang, Peihao Yin

PMC · DOI: 10.3389/fragi.2026.1772792 · Frontiers in Aging · 2026-02-26

## TL;DR

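This retrospective study of 1,650 community health center patients developed and validated twelve machine learning models to predict sarcopenia risk. The best performers (CatBoost, LightGBM, and Gradient Boosting) reached AUROC values of 0.995 to 0.999, and SHAP analysis ranked SARC_Cal_score, BMI, and age among the most influential predictors.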

## Contribution

The study contributes ML models for sarcopenia risk prediction built from demographic, clinical, and lifestyle data collected in a community hospital setting, with SHAP analysis used to identify the most influential predictors.

## Key findings

- CatBoost, LightGBM, and Gradient Boosting models achieved AUROC values of 0.999, 0.996, and 0.995, respectively.
- SHAP analysis ranked SARC_Cal_score, BMI, and age among the most influential predictors of sarcopenia.
- Higher chronic disease burden was positively associated with sarcopenia risk.

## Abstract

Sarcopenia, a progressive age-related loss of skeletal muscle mass and strength, represents a growing public health challenge amid global population aging. Early detection remains difficult with conventional diagnostic approaches.

This study aimed to develop and validate reliable machine learning (ML) models to identify key risk factors for sarcopenia in community hospital settings. Using retrospective data from 1,650 patients at a community health center, we collected comprehensive demographic, clinical, and lifestyle variables. Twelve ML models, including Random Forest, Support Vector Machine, XGBoost, and Logistic Regression, were constructed and evaluated using 5-fold cross-validation.
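To make the evaluation protocol concrete, the following is a minimal sketch of 5-fold cross-validation scored by AUROC, written in standard-library Python. It is not the authors' pipeline: the helper names (`auroc`, `stratified_kfold`, `cross_validated_auroc`), the stratification by class, the stand-in "model", and the toy data are all assumptions made for illustration.

```python
import random


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with similar class ratios per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin into the folds.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def cross_validated_auroc(X, y, fit, predict, k=5, seed=0):
    """Fit on k-1 folds, score the held-out fold, and average the AUROCs."""
    fold_scores = []
    for train, test in stratified_kfold(y, k, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = [predict(model, X[i]) for i in test]
        fold_scores.append(auroc([y[i] for i in test], preds))
    return sum(fold_scores) / len(fold_scores)


# Toy, perfectly separable data (NOT the study data): one feature, 10 cases per class.
X = [[float(i)] for i in range(20)]
y = [0] * 10 + [1] * 10


def fit_direction(X_train, y_train):
    """Hypothetical stand-in model: learns only whether higher values mean higher risk."""
    mean_pos = sum(x[0] for x, t in zip(X_train, y_train) if t == 1) / y_train.count(1)
    mean_neg = sum(x[0] for x, t in zip(X_train, y_train) if t == 0) / y_train.count(0)
    return 1.0 if mean_pos >= mean_neg else -1.0


def predict_score(direction, x):
    """Risk score for one sample: the feature value, signed by the learned direction."""
    return direction * x[0]


print(cross_validated_auroc(X, y, fit_direction, predict_score))
```

In practice a library such as scikit-learn would replace the hand-written splitter and metric; the point of the sketch is only that every patient is scored exactly once by a model that never saw that patient during fitting.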

The CatBoost, LightGBM, and Gradient Boosting Decision Tree models demonstrated superior predictive performance, with area under the receiver operating characteristic curve (AUROC) values of 0.999, 0.996, and 0.995, respectively. SHapley Additive exPlanations (SHAP) analysis revealed that SARC_Cal_score, body mass index (BMI), and age were among the most influential predictors, while a greater chronic disease burden was positively associated with sarcopenia risk.
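SHAP attributions are built on Shapley values, which split the difference between a prediction and a baseline prediction across the input features. The sketch below computes exact Shapley values by enumerating feature subsets, which is only feasible for a handful of features; SHAP libraries use faster approximations or model-specific algorithms. The model `toy_risk`, its coefficients, and the patient and baseline values are invented for illustration and are not taken from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) to each feature."""
    names = list(x)
    n = len(names)

    def value(subset):
        # Features in `subset` take the patient's values; the rest stay at baseline.
        mixed = {f: (x[f] if f in subset else baseline[f]) for f in names}
        return predict(mixed)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature f.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_risk(features):
    """Hypothetical linear risk score; the coefficients are made up."""
    return (0.4 * features["SARC_Cal_score"]
            - 0.1 * features["BMI"]
            + 0.02 * features["age"])


# Illustrative values only, not drawn from the study cohort.
patient = {"SARC_Cal_score": 12.0, "BMI": 19.0, "age": 78.0}
baseline = {"SARC_Cal_score": 6.0, "BMI": 24.0, "age": 65.0}

phi = shapley_values(toy_risk, patient, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f}")
```

The attributions always sum to the gap between the patient's prediction and the baseline prediction, which is what lets per-feature contributions be ranked and compared across patients.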

In conclusion, ML models show substantial potential for clinical application in identifying sarcopenia risk, thereby supporting early intervention strategies. This approach enhances detection capabilities and provides a practical tool for individualized treatment planning in community-based elderly care. Future research should integrate additional biomarkers and environmental factors to further improve model accuracy and facilitate integration into clinical workflows.

## Full-text entities

- **Diseases:** Sarcopenia (MESH:D055948), loss of skeletal muscle mass (MESH:C536030)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12979493/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12979493/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/PMC12979493/full.md

---
Source: https://tomesphere.com/paper/PMC12979493