# Machine learning improves prediction of pulmonary thromboembolism and reduces unnecessary computed tomography scans in the emergency department

**Authors:** Sung Hyun Yoon, Cheolho Kwon, Yeongho Choi, Hyung-Jun Kim, Jihang Kim, Young Hoon Kim

PMC · DOI: 10.1038/s41598-025-34952-x · 2026-01-09

## TL;DR

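In a retrospective study of 2,525 emergency department patients with elevated D-dimer at one tertiary hospital, machine learning models, led by XGBoost (AUC 0.814), predicted pulmonary thromboembolism better than the revised Geneva score (AUC 0.622) and could reduce CTPA scans by 14.8% at 95% sensitivity.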

## Contribution

Machine learning models, particularly XGBoost, outperform the revised Geneva score in predicting PTE and may help reduce unnecessary CTPA imaging.

## Key findings

- XGBoost achieved the highest AUC, 0.814 (95% CI: 0.759–0.862), compared with 0.622 (95% CI: 0.563–0.675) for the revised Geneva score.
- At sensitivities of 100%, 95%, and 90%, the XGBoost model could reduce CTPA scans by 3.0%, 14.8%, and 33.2%, respectively (all p < 0.001).
- D-dimer and activated partial thromboplastin time were the most important predictors across all models.
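
The scan-reduction figures depend on choosing a score cut-off that preserves a target sensitivity. The following is a minimal sketch of that idea, not the authors' code: it assumes that patients scoring below the cut-off would skip CTPA and that "reduction" means the share of all patients below the cut-off. The function names and the toy data are hypothetical.

```python
import math


def threshold_for_sensitivity(scores, labels, target_sensitivity):
    """Return the highest observed cut-off whose sensitivity is >= target.

    Assumption: patients with score >= cut-off are referred for imaging.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one positive case")
    # Number of positive cases that must be flagged to reach the target.
    needed = max(1, math.ceil(target_sensitivity * len(positives) - 1e-12))
    return positives[needed - 1]


def scan_reduction(scores, labels, target_sensitivity):
    """Return (cut-off, achieved sensitivity, fraction of scans avoided)."""
    cutoff = threshold_for_sensitivity(scores, labels, target_sensitivity)
    n_pos = sum(1 for y in labels if y == 1)
    flagged_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    avoided = sum(1 for s in scores if s < cutoff)
    return cutoff, flagged_pos / n_pos, avoided / len(scores)


# Hypothetical toy data: predicted risk and confirmed PTE status (1 = PTE).
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.50, 0.60, 0.70, 0.80, 0.90]
toy_labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

for target in (1.0, 0.75):
    cutoff, sensitivity, reduction = scan_reduction(toy_scores, toy_labels, target)
    print(f"target={target:.2f} cutoff={cutoff:.2f} "
          f"sensitivity={sensitivity:.2f} scans avoided={reduction:.0%}")
```

Lowering the target sensitivity raises the cut-off and avoids more scans, which is the trade-off behind the 3.0%, 14.8%, and 33.2% reductions reported in the abstract.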

## Abstract

The diagnosis of pulmonary thromboembolism (PTE) remains challenging due to its nonspecific clinical signs and symptoms. This study aimed to develop a machine learning (ML) model to predict PTE in emergency department patients. We retrospectively analyzed 2,525 emergency department patients suspected of PTE who underwent computed tomography pulmonary angiography (CTPA) within 7 days after elevated D-dimer levels (≥ 0.5 µg/ml) at a tertiary hospital, between January 2012 and December 2021. Clinical and laboratory data were split into training (n = 2,025) and test (n = 500) sets. Six ML models (XGBoost, random forest, logistic regression, elastic net regression, support vector machine, and feed-forward neural network) were compared with the revised Geneva score using the area under the receiver operating characteristic curve (AUC). Variable importance was assessed using permutation methods. Of the 2,525 patients, 573 (22.7%) were diagnosed with PTE. XGBoost achieved the highest AUC of 0.814 (95% confidence interval [CI]: 0.759–0.862). All ML models outperformed the revised Geneva score, which had an AUC of 0.622 (95% CI: 0.563–0.675). D-dimer and activated partial thromboplastin time were the most important predictors across all ML models. At sensitivities of 100%, 95%, and 90%, the XGBoost model could reduce the number of CTPA scans by 3.0%, 14.8%, and 33.2%, respectively (all p < 0.001). These findings suggest that ML models, particularly XGBoost, can improve PTE risk prediction compared with the revised Geneva score and may help reduce unnecessary CTPA imaging in the emergency department.
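
The comparison above rests on the AUC and its 95% confidence interval. The sketch below shows one common way to obtain both: the AUC as the fraction of positive/negative pairs ranked correctly, and a percentile bootstrap for the interval. The abstract does not say how the intervals were computed, so the bootstrap is an assumption, and the function names and toy data are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random positive case outranks a
    random negative case, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need both positive and negative cases")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical toy data: predicted risk and confirmed PTE status (1 = PTE).
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.50, 0.60, 0.70, 0.80, 0.90]
toy_labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

point = auc(toy_scores, toy_labels)
lower, upper = bootstrap_auc_ci(toy_scores, toy_labels)
print(f"AUC = {point:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

The pairwise loop is quadratic in sample size, which is fine for illustration; on a 500-patient test set a rank-based implementation from an established library would be the more practical choice.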

The online version contains supplementary material available at 10.1038/s41598-025-34952-x.

## Full-text entities

- **Diseases:** PTE (MESH:D011655)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12873316/full.md

---
Source: https://tomesphere.com/paper/PMC12873316