# Developing Machine-Learning Models to Predict Bacteremia in Febrile Adults Presenting to the Emergency Department: A Retrospective Cohort Study from a Large Center

**Authors:** Chia-Ming Fu, Ike Ngo, Pak Sheung Lau, Yaroslav Ivanchuk, Fan-Ya Chou, Chih-Hung Wang, Chien-Yu Lin, Chu-Lin Tsai, Shey-Ying Chen, Tsung-Chien Lu, Hung-Yu Wei

PMC · DOI: 10.5811/westjem.35866 · 2025-05-30

## TL;DR

This study developed machine-learning models that use triage data to predict bacteremia in febrile adults presenting to the emergency department. On a held-out testing set the models reached AUCs of 0.828 to 0.844, performance that could support earlier diagnosis and better patient outcomes.

## Contribution

The novel contribution is the development and evaluation of ML models using triage data to predict bacteremia in febrile emergency department patients.

## Key findings

- CatBoost achieved the highest AUC of 0.844 in predicting bacteremia from triage data.
- All five tested machine-learning models performed well, with AUCs ranging from 0.828 (random forest) to 0.844 (CatBoost).
- The model could potentially improve emergency care by enabling early identification of bacteremia.

## Abstract

Bacteremia is common but difficult to diagnose early, and without prompt treatment it may result in significant morbidity and mortality. We aimed to develop machine-learning (ML) algorithms that identify bacteremia among febrile patients presenting to the emergency department (ED), using data readily available at triage.

We included all adult patients (≥18 years of age) who presented to the ED of National Taiwan University Hospital (NTUH), a tertiary teaching hospital in Taiwan, with a chief complaint of fever or a measured body temperature above 38°C, and who received at least one blood culture during the ED encounter. We extracted data from the Integrated Medical Database of NTUH from 2009–2018. The dataset included patient demographics, triage details, symptoms, and medical history. We defined bacteremia as a blood culture positive for at least one potential pathogen and used it as the binary classification label. We split the dataset into training/validation and testing sets (60-to-40 ratio) and trained five supervised ML models using k-fold cross-validation. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) in the testing set.
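The split-and-evaluate step described above can be illustrated with a minimal, standard-library sketch. It is not the authors' pipeline: the function names `stratified_split` and `roc_auc` are hypothetical, and stratifying the split by label is an assumption, since the abstract states only the 60-to-40 ratio. The AUC is computed through its rank-sum (Mann-Whitney) form, that is, the probability that a randomly chosen bacteremia case receives a higher score than a randomly chosen non-bacteremia case.

```python
import random


def stratified_split(labels, test_fraction=0.4, seed=0):
    """Return (train_idx, test_idx) with label prevalence preserved in each part.

    Stratification is an assumption of this sketch; the source only gives the ratio.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def roc_auc(labels, scores):
    """AUC via the rank-sum statistic; tied scores receive their average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of scores tied with pairs[i].
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy example: 1 = bacteremia, 0 = no bacteremia; scores are made-up model outputs.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

In the toy example, three of the four positive-negative pairs are ordered correctly, which gives an AUC of 0.75. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.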

We included 80,201 cases in this study. Of these, 48,120 cases were assigned to the training/validation set and 32,081 to the testing set. Bacteremia was identified in 5,831 (12.1%) and 3,824 (11.9%) cases of the training/validation and testing sets, respectively. All ML models performed well, with CatBoost achieving the highest AUC (.844, 95% confidence interval [CI] .837–.850), followed by extreme gradient boosting (.843, 95% CI .836–.849), gradient boosting (.842, 95% CI .836–.849), light gradient boosting machine (.841, 95% CI .834–.847), and random forest (.828, 95% CI .821–.834).
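As a quick consistency check on the counts reported above, the following sketch recomputes the split ratio and the bacteremia prevalence in each set. All numbers are taken from the abstract; the variable names are illustrative only.

```python
# Counts as reported in the abstract.
total = 80_201
train_n, test_n = 48_120, 32_081
train_pos, test_pos = 5_831, 3_824

# The two sets should account for every included case.
assert train_n + test_n == total

train_share = train_n / total                  # share assigned to training/validation
train_prev = train_pos / train_n               # prevalence in training/validation set
test_prev = test_pos / test_n                  # prevalence in testing set
overall_prev = (train_pos + test_pos) / total  # prevalence in the whole cohort

print(f"training/validation share: {train_share:.1%}")  # 60.0%
print(f"training/validation prevalence: {train_prev:.1%}")  # 12.1%
print(f"testing prevalence: {test_prev:.1%}")  # 11.9%
print(f"overall prevalence: {overall_prev:.1%}")  # 12.0%
```

The recomputed values match the reported 60-to-40 split and the 12.1% and 11.9% prevalences, so roughly one in eight febrile patients who had a blood culture drawn had bacteremia.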

Our machine-learning model showed excellent discriminatory performance in predicting bacteremia using only clinical features available at ED triage. If successfully implemented in the ED, it has the potential to improve care quality and save lives.

## Linked entities

- **Diseases:** bacteremia (MONDO:0005229)

## Full-text entities

- **Diseases:** Bacteremia (MESH:D016470), fever (MESH:D005334), Febrile (MESH:D000071072)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12208070/full.md

---
Source: https://tomesphere.com/paper/PMC12208070