# Development and validation of a machine learning model for predicting preoperative deep vein thrombosis in elderly hip fracture patients

**Authors:** Xiaokang Wei, Zehao Yin, Shuqi Zhang, Maosheng Zhang, Dong Zhu

PMC · DOI: 10.3389/fmed.2026.1696325 · Frontiers in Medicine · 2026-02-05

## TL;DR

An XGBoost-based machine learning model was developed and validated to predict preoperative deep vein thrombosis in elderly hip fracture patients, with the goal of improving risk assessment and clinical decision-making.

## Contribution

A novel XGBoost-based predictive model with SHAP interpretability for preoperative DVT risk in elderly hip fracture patients.

## Key findings

- XGBoost achieved an AUC of 0.829 on the training set and 0.808 on the validation set.
- The model used five features: injury-to-admission time, D-dimer, hemoglobin, albumin, and activated partial thromboplastin time (APTT).
- Calibration curves showed strong agreement between predicted and observed outcomes, and decision curve analysis (DCA) indicated clinical benefit.

## Abstract

Hip fractures in older adults pose a global health challenge. Deep vein thrombosis (DVT) is a common complication that increases surgical risk, delays procedures, and can cause severe thromboembolic events. It also hampers recovery and lowers quality of life, so prompt risk assessment and intervention are crucial. This study aims to develop and validate a machine learning model that predicts DVT before surgery in elderly hip fracture patients, in order to improve preoperative assessment and streamline clinical care pathways.

This study employed a retrospective design and included elderly patients who were hospitalized for hip fractures at a university-affiliated hospital between July 2022 and May 2025. A total of 782 patients met the inclusion criteria. The dataset was randomly divided into a training set (70%) and a validation set (30%). Five supervised machine learning algorithms were used to develop predictive models: decision tree (DT), extreme gradient boosting (XGBoost), support vector machine (SVM), light gradient boosting machine (LightGBM), and logistic regression (LR). Model performance was evaluated on the basis of discrimination, calibration, and clinical applicability, with SHAP analysis used for interpretability.
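The 70/30 random split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_train_validation`, the seed, the rounding rule, and the placeholder integer IDs are all assumptions, and the source does not say whether the split was stratified by DVT status.

```python
import random


def split_train_validation(ids, train_fraction=0.7, seed=0):
    """Randomly partition ids into (train, validation) lists."""
    shuffled = list(ids)
    # A seeded generator keeps the partition reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    # Rounding to the nearest integer is an assumption; the paper does not
    # state how the 70% boundary was computed.
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder IDs standing in for the 782 included patients (not real data).
patient_ids = list(range(782))
train_ids, validation_ids = split_train_validation(patient_ids)
print(len(train_ids), len(validation_ids))
```

With 782 records and this rounding rule, the sketch yields 547 training and 235 validation IDs; the set sizes in the paper itself may differ.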

Among the 782 elderly patients with hip fractures, 186 (23.8%) had DVT. Five features were selected for model construction: injury-to-admission time, D-dimer levels, hemoglobin levels, albumin levels, and activated partial thromboplastin time (APTT). Among all models, XGBoost achieved superior predictive accuracy, yielding an area under the receiver operating characteristic curve (AUC) of 0.829 (95% CI: 0.788–0.870) on the training set and 0.808 (95% CI: 0.742–0.874) on the validation set. Calibration curve assessment validated the model’s strong agreement between predicted and observed outcomes, and decision curve analysis (DCA) demonstrated notable clinical advantages.
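To make the reported AUC and its 95% CI concrete, the sketch below computes AUC as the probability that a randomly chosen DVT case receives a higher predicted risk than a randomly chosen non-DVT case, then estimates an interval with a percentile bootstrap. The source does not state how its intervals were derived, so the bootstrap is an assumed stand-in; the function names and the eight toy labels and scores are invented for illustration and are not study data.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Approximate percentile bootstrap interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data for illustration only: 1 = DVT, 0 = no DVT.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9, 0.2, 0.6]
toy_auc = auc(toy_labels, toy_scores)
toy_ci = bootstrap_auc_ci(toy_labels, toy_scores)
print(toy_auc, toy_ci)
```

On a sample this small the interval is very wide, which is why the narrower interval on the training set than on the validation set in the abstract is consistent with their different sizes.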

The XGBoost-based predictive model for preoperative DVT in elderly patients with hip fractures demonstrated superior performance. By integrating the SHAP method to enhance model interpretability and developing an intuitive web-based tool, the model’s clinical applicability was markedly improved. This predictive tool holds promise for assisting clinicians in risk assessment and guiding medical decision-making.
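SHAP attributions are based on Shapley values, which divide the difference between one patient's prediction and a baseline prediction among the input features. The sketch below computes exact Shapley values by enumerating feature subsets for a toy scoring function. Everything in it is hypothetical: `toy_score`, its coefficients, and the unitless feature values are arbitrary and carry no clinical meaning, and tree models such as XGBoost are typically explained with tree-specific algorithms rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to baseline."""
    names = list(x)
    n = len(names)

    def value(subset):
        # Features in subset take the patient's value; the rest take baseline.
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_score(f):
    # Arbitrary toy function with one interaction term; not the paper's model.
    return 0.5 * f["d_dimer"] - 0.1 * f["hemoglobin"] + 0.2 * f["d_dimer"] * f["aptt"]


# Arbitrary, unitless illustration values (not patient data).
patient = {"d_dimer": 4.0, "hemoglobin": 9.0, "aptt": 2.0}
reference = {"d_dimer": 1.0, "hemoglobin": 12.0, "aptt": 1.0}
phi = shapley_values(toy_score, patient, reference)
print(phi)
```

The attributions sum to `toy_score(patient) - toy_score(reference)`, the additivity property that lets a per-patient explanation be read as a breakdown of the predicted risk.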

## Full-text entities

- **Genes:** ALB (albumin) [NCBI Gene 213] {aka FDAHT, HSA, PRO0883, PRO0903, PRO1341}, ITIH1 (inter-alpha-trypsin inhibitor heavy chain 1) [NCBI Gene 3697] {aka H1P, IATIH, ITI-HC1, ITIH, SHAP}, F3 (coagulation factor III, tissue factor) [NCBI Gene 2152] {aka CD142, TF, TFA}, SLC17A5 (solute carrier family 17 member 5) [NCBI Gene 26503] {aka AST, ISSD, NSD, SD, SIALIN, SIASD}, GGT1 (gamma-glutamyltransferase 1) [NCBI Gene 2678] {aka CD224, D22S672, D22S732, GGT, GGT 1, GGTD}, HIF1A (hypoxia inducible factor 1 subunit alpha) [NCBI Gene 3091] {aka HIF-1-alpha, HIF-1A, HIF-1alpha, HIF1, HIF1-ALPHA, MOP1}, F2 (coagulation factor II, thrombin) [NCBI Gene 2147] {aka PT, RPRGL2, THPH1}, FGB (fibrinogen beta chain) [NCBI Gene 2244] {aka HEL-S-78p}, GGTLC5P (gamma-glutamyltransferase light chain 5 pseudogene) [NCBI Gene 653590] {aka GGT}, GPT (glutamic--pyruvic transaminase) [NCBI Gene 2875] {aka AAT1, ALT, ALT1, GPT1, SGPT}
- **Diseases:** coagulation (MESH:D001778), cerebral infarction (MESH:D002544), DVT (MESH:D020246), anemia (MESH:D000740), bone metastases (MESH:D009362), thrombosis (MESH:D013927), malnutrition (MESH:D044342), venous thromboembolism (MESH:D054556), hematological disorders (MESH:D006402), hypertension (MESH:D006973), Hip fractures (MESH:D006620), tissue injury (MESH:D017695), thromboembolic (MESH:D013923), deficiency of albumin (OMIM:194470), endothelial dysfunction (MESH:D014652), diabetes (MESH:D003920), liver cirrhosis (MESH:D008103), swelling (MESH:D004487), chronic kidney disease (MESH:D051436), Inflammatory (MESH:D007249), trauma (MESH:D014947), platelet aggregation (MESH:D001791), pain (MESH:D010146), fracture (MESH:D050723), impaired venous blood flow (MESH:D054318), PE (MESH:D011655), hypoxia (MESH:D000860), bleeding (MESH:D006470), autoimmune diseases (MESH:D001327), thrombophilia (MESH:D019851), hypoalbuminemia (MESH:D034141)
- **Chemicals:** D (MESH:D003903), heparin (MESH:D006493), alcohol (MESH:D000438), Cr (MESH:D003404), rivaroxaban (MESH:D000069552), bilirubin (MESH:D001663), oxygen (MESH:D010100), free fatty acids (MESH:D005230), fondaparinux (MESH:D000077425)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12916597/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12916597/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/PMC12916597/full.md

---
Source: https://tomesphere.com/paper/PMC12916597