# Development and Validation of Machine Learning Models for Predicting Falls Among Hospitalized Older Adults: Retrospective Cross-Sectional Study

**Authors:** Xiyao Yang, Juan Ren, Dan Su, Manzhen Bao, Miao Zhang, Xiaoming Chen, Yanhua Li, Zonggui Wang, Xiujing Dai, Zengzeng Wei, Shuiyu Zhang, Yuxin Zhang, Juan Li, Xiaolin Li, Junjin Xu, Nan Mo

PMC · DOI: 10.2196/80602 · JMIR Aging · 2026-01-05

## TL;DR

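This retrospective study of hospitalized older adults (342 fallers and 684 matched nonfallers) compared seven machine learning algorithms for predicting fall risk. A gradient boosting machine performed best (C-index 0.744), and it was interpreted with SHAP.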

## Contribution

The study develops and validates a gradient boosting machine model for predicting falls among hospitalized older adults, and interprets it with Shapley Additive Explanations (SHAP) to support use in clinical settings.

## Key findings

- The gradient boosting machine model had the best predictive performance of the seven algorithms compared (C-index 0.744, 95% CI 0.688‐0.799).
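- The eight most important variables were dizziness, epilepsy, fall history within the past 3 months, use of walking assistance, emergency admission, Morse Fall Scale score, modified Barthel Index score, and the number of indwelling catheters.
- The model was interpreted with SHAP, with the aim of making its predictions easier to understand and apply in clinical practice.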

## Abstract

Falls are one of the leading causes of injury and death among older adults. Falls during hospitalization are adverse events and a key concern for health care institutions. Identifying older adults at high risk of falls in clinical settings enables early intervention, which can reduce the incidence of falls.

This study aims to develop and validate machine learning models to predict the risk of falls among hospitalized older adults.

This study retrospectively analyzed data collected between January 2018 and December 2024 at a tertiary general hospital in China, covering 342 older adults who experienced falls and 684 randomly matched nonfallers. Variables included demographic information, comorbidities, laboratory parameters, and medication use, among others. The dataset was randomly split into training and testing sets in a 7:3 ratio. Predictors were selected from the training set using stepwise regression, least absolute shrinkage and selection operator, and random forest-recursive feature elimination. Seven machine learning algorithms were used to develop predictive models in the training set, and their performance was compared in the testing set. The optimal model was interpreted using Shapley Additive Explanations (SHAP).
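To make the 7:3 split concrete, here is a minimal sketch using only the Python standard library. The abstract says only that the split was random. Stratifying by outcome, the seed, and the function name `stratified_split` are assumptions made for illustration, not details from the paper.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), shuffling and cutting each class separately.

    Stratification is an assumption here; the abstract only states a random 7:3 split.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)

# Cohort sizes from the abstract: 342 fallers (1) and 684 matched nonfallers (0).
labels = [1] * 342 + [0] * 684
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 718 308
```

With these cohort sizes, the sketch puts 239 fallers and 479 nonfallers in the training set, so both sets keep roughly the original 1:2 ratio.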

The gradient boosting machine model demonstrated the best predictive performance (C-index 0.744, 95% CI 0.688‐0.799). The 8 most important variables associated with fall risk were dizziness, epilepsy, fall history within the past 3 months, use of walking assistance, emergency admission, Morse Fall Scale scores, modified Barthel Index scores, and the number of indwelling catheters. The model was interpreted using SHAP to enhance the clinical utility of the predictive model.
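For a binary outcome such as fall versus no fall, the C-index equals the area under the ROC curve: the probability that a randomly chosen faller receives a higher predicted risk than a randomly chosen nonfaller. The sketch below computes it by pairwise comparison and attaches a percentile bootstrap interval. The abstract does not say how the 95% CI was obtained, so the bootstrap is an assumption, and the toy labels and scores are invented.

```python
import random

def c_index(y_true, scores):
    """Share of (faller, nonfaller) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one faller and one nonfaller")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))

def bootstrap_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C-index (an assumed method)."""
    c_index(y_true, scores)  # raises early if one class is missing
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(c_index(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = int(n_boot * alpha / 2)
    return stats[lo_idx], stats[n_boot - 1 - lo_idx]

# Invented toy data: 1 = faller, 0 = nonfaller.
y_toy = [1, 1, 0, 0]
s_toy = [0.9, 0.4, 0.5, 0.1]
print(c_index(y_toy, s_toy))  # 0.75
```

In the toy data, three of the four faller and nonfaller pairs are ranked correctly, which gives 0.75.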

The gradient boosting machine model was identified as the optimal predictive model. The SHAP method enhanced its integration into clinical workflows.
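SHAP attributes a prediction to its input features using Shapley values. As a rough illustration of the idea, the sketch below computes exact Shapley values for a tiny model by enumerating feature subsets, with absent features set to a baseline value. This is not the SHAP library and not the authors' model: the function `toy_risk`, its coefficients, and the baseline are hypothetical. Only the three predictor names come from the abstract.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

# Hypothetical risk score over three binary predictors; coefficients are invented.
def toy_risk(z):
    dizziness, fall_history, walking_aid = z
    return 0.10 + 0.20 * dizziness + 0.30 * fall_history + 0.10 * fall_history * walking_aid

x = [1, 1, 1]         # patient with all three risk factors present
baseline = [0, 0, 0]  # reference patient with none
phi = shapley_values(toy_risk, x, baseline)
print([round(p, 3) for p in phi])
```

The attributions come out at about 0.20, 0.35, and 0.05, and they sum to the difference between the patient's score (0.70) and the baseline score (0.10). The interaction term is shared equally between fall history and walking aid.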

## Linked entities

- **Diseases:** epilepsy (MONDO:0005027)

## Full-text entities

- **Diseases:** injury (MESH:D014947), death (MESH:D003643), Fall (MESH:C537863), dizziness (MESH:D004244), epilepsy (MESH:D004827)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12767673/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12767673/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/PMC12767673/full.md

---
Source: https://tomesphere.com/paper/PMC12767673