# Development and external validation of models to improve prediction of osteoporosis in elderly women: interpretable machine learning

**Authors:** Tian Tang, Shiwen Wang, Shengziyi Cai, Yun Hu

PMC · DOI: 10.3389/fendo.2025.1719698 · Frontiers in Endocrinology · 2026-01-09

## TL;DR

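This study developed and externally validated a random forest model that predicts osteoporosis in older women from routine clinical test results and comorbidity data. The model reached an AUC of 0.805 on the internal test set and 0.740 in an external cohort, with SHAP used for interpretation.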

## Contribution

The novel contribution is an interpretable, externally validated machine learning model for osteoporosis prediction using routine clinical data.

## Key findings

- Random forest achieved an AUC of 0.805 in the internal test set and 0.740 in the external cohort.
- SHAP ranked age, drinking, diabetes, and eGFR as the four strongest predictors of osteoporosis risk.
- The model provides calibrated probabilities and case-level explanations for clinical use.

## Abstract

As populations age and the prevalence of osteoporosis (OP) increases, osteoporotic fractures substantially raise disability and mortality and impose growing economic burdens, threatening health and quality of life. This study aimed to develop and externally validate a reliable, practical machine learning model to predict OP in older women using routine clinical test results and comorbidity data.

We retrospectively assembled an internal dataset from NHANES (2003–2020) and randomly split it 70:30 into training and test sets. An external cohort from a Chinese tertiary hospital was used for validation. Predictors were selected using LASSO in the training set. Five algorithms (XGBoost, SVM, RF, LightGBM, and Naive Bayes) were tuned, and model performance was evaluated on the test set and in the external cohort. Calibration curves and decision curve analysis (DCA) were used to assess calibration and clinical net benefit. Feature contributions were quantified with Shapley additive explanations (SHAP).
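Decision curve analysis compares the net benefit of acting on model predictions with the net benefit of treating everyone or no one. The following is a minimal sketch of the standard net-benefit formula, not the authors' code. The function names and the toy labels and probabilities are hypothetical and chosen only for illustration.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold.

    Standard decision-curve formula: TP/n - FP/n * pt / (1 - pt),
    where pt is the threshold probability.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    harm_weight = threshold / (1.0 - threshold)
    return tp / n - fp / n * harm_weight


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data (1 = osteoporosis), NOT taken from the paper.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.8, 0.05, 0.35]

for pt in (0.2, 0.3, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = treat_all_net_benefit(y_true, pt)
    # "Treat none" always has a net benefit of 0.
    print(f"threshold={pt:.2f} model={model_nb:.3f} treat_all={all_nb:.3f}")
```

A model is considered useful at a given threshold when its net benefit exceeds both reference strategies. Plotting these values over a range of thresholds gives the decision curve.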

Among 3,950 women in the internal dataset, 833 (21.1%) had OP; in the external cohort (n=338), 167 (49.4%) had OP. SHAP ranked predictors (high to low) as: age, drinking, diabetes, eGFR, HbA1c, BMI, HDL, TG, BUN, and TBIL. After hyperparameter tuning, RF achieved an AUC of 0.805 in the internal test set and 0.740 in the external cohort; in the internal test set, accuracy was 0.82, precision 0.83, and specificity 0.97. Calibration was acceptable, and DCA indicated clinical utility across relevant thresholds.
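For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, specificity, and AUC are defined. It uses hypothetical toy predictions rather than the study data, and the helper names (`Confusion`, `confusion_at`, `auc`) are illustrative assumptions. The 0.5 cut-off is also an assumption, since the summary does not state the classification threshold used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int
    fp: int
    tn: int
    fn: int

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    @property
    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


def confusion_at(y_true, y_prob, threshold=0.5) -> Confusion:
    """Count outcomes when risk >= threshold is classified as positive."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return Confusion(tp, fp, tn, fn)


def auc(y_true, y_prob) -> float:
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = osteoporosis), NOT taken from the paper.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.8, 0.05, 0.35]

cm = confusion_at(y_true, y_prob)
print(cm, cm.accuracy, cm.precision, cm.specificity, auc(y_true, y_prob))
```

Accuracy, precision, and specificity depend on the chosen cut-off, whereas AUC summarizes ranking quality across all cut-offs.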

A random forest model using readily available clinical data predicts osteoporosis risk in older women with robust internal and external performance. The deployed model outputs calibrated patient-level probabilities, provides case-level explanations using SHAP, and supports dynamic rescoring as new routine results become available, enabling individualized risk management in routine care.
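To illustrate what a case-level explanation means, the sketch below computes exact Shapley values for one hypothetical patient by enumerating all feature subsets. It is not the authors' implementation. The SHAP library uses faster model-specific algorithms, and the `toy_risk` function, the three features, and the baseline values here are assumptions made for illustration only.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    names = list(x)
    n = len(names)
    values = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = {k: (x[k] if k in subset else baseline[k]) for k in names}
                with_feature = dict(without, **{name: x[name]})
                total += weight * (predict(with_feature) - predict(without))
        values[name] = total
    return values


def toy_risk(f):
    """Hypothetical risk score, NOT the paper's random forest."""
    score = 0.01 * (f["age"] - 60) + 0.10 * f["diabetes"] - 0.002 * (f["egfr"] - 90)
    # Interaction term so the attribution is not trivially additive.
    score += 0.05 * f["diabetes"] * (f["age"] > 70)
    return score


# Hypothetical patient and reference profile.
patient = {"age": 78, "diabetes": 1, "egfr": 55}
baseline = {"age": 65, "diabetes": 0, "egfr": 90}

contributions = shapley_values(toy_risk, patient, baseline)
for feature, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{feature}: {value:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the additivity property that makes per-patient SHAP plots readable. Exact enumeration scales exponentially with the number of features, so it is only practical for small examples like this one.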

## Linked entities

- **Chemicals:** BUN (PubChem CID 91971254), HDL (PubChem CID 6323542), TG (PubChem CID 2723601)
- **Diseases:** osteoporosis (MONDO:0005298), diabetes (MONDO:0005015)

## Full-text entities

- **Diseases:** diabetes (MESH:D003920), OP (MESH:D010024), osteoporotic fractures (MESH:D058866)
- **Chemicals:** TG (MESH:D013866)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12827086/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12827086/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/PMC12827086/full.md

---
Source: https://tomesphere.com/paper/PMC12827086