431. Performance of an Expert Recommendation Framework for Blood Culture Stewardship: Comparing Clinician Manual Review and Large Language Model Automation

Nicholas P Marshall; Fatemeh Amrollahi; Manoj Maddali; Kameron Black; Aydin Zahedivash; Fateme Nateghi Haredasht; Stephen Ma; Amy Chang; Stan Deresinski; Niaz Banaei; Mary Kane Goldstein; Steven Asch; Jonathan H Chen

PMC · DOI: 10.1093/ofid/ofaf695.144 · January 11, 2026

Open Access

TL;DR

This study compares how well doctors and an AI system can prioritize blood culture testing in emergency department patients based on infection risk.

Contribution

The novel contribution is anchoring both clinician and LLM classifications in the Fabre framework to improve precision for blood culture stewardship.

Findings

01. Manual clinician review achieved 86% sensitivity but only 57% specificity in predicting bacteremia risk.

02. LLM-based automation had high sensitivity (96%) but very low specificity (16%), over-classifying many negatives as positives.

03. A hybrid model combining LLM screening with clinician review of high-risk cases may improve accuracy and resource use.
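To make the sensitivity and specificity figures above concrete, here is a minimal Python sketch of how the two metrics are computed from a confusion matrix, and of one possible two-stage hybrid rule. The patient labels are made-up toy data, not study data, and the `hybrid` rule (an LLM flag followed by clinician confirmation) is an assumption for illustration; the source does not specify how the hybrid model would combine the two reviewers.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # bacteremic, flagged high risk
    fp: int  # not bacteremic, flagged high risk
    tn: int  # not bacteremic, flagged low risk
    fn: int  # bacteremic, flagged low risk


def confusion(y_true, y_pred):
    """Count the four confusion-matrix cells from paired boolean labels."""
    pairs = list(zip(y_true, y_pred))
    return Confusion(
        tp=sum(1 for t, p in pairs if t and p),
        fp=sum(1 for t, p in pairs if not t and p),
        tn=sum(1 for t, p in pairs if not t and not p),
        fn=sum(1 for t, p in pairs if t and not p),
    )


def sensitivity(c):
    """Fraction of true bacteremia cases that were flagged."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Fraction of non-bacteremia cases that were correctly not flagged."""
    return c.tn / (c.tn + c.fp)


def hybrid(llm_flags, clinician_flags):
    """Hypothetical two-stage rule: the LLM screens every case, and only
    cases it flags are kept as high risk if a clinician also flags them."""
    return [bool(l and c) for l, c in zip(llm_flags, clinician_flags)]


# Toy labels for eight hypothetical patients (NOT data from the study).
truth     = [True, True, True, False, False, False, False, False]
llm       = [True, True, True, True,  True,  True,  True,  False]
clinician = [True, True, False, True, False, False, True,  False]

for name, pred in [("llm", llm), ("clinician", clinician),
                   ("hybrid", hybrid(llm, clinician))]:
    c = confusion(truth, pred)
    print(f"{name}: sensitivity={sensitivity(c):.2f}, "
          f"specificity={specificity(c):.2f}")
```

Under this particular rule a case is high risk only if both stages flag it, so the hybrid's sensitivity can never exceed that of either stage and its specificity can never fall below that of either stage. Whether that trade-off is acceptable would depend on the real data.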

Abstract

The 2024 blood culture bottle shortage created an urgent need to conserve supplies and prioritize high-yield testing. Institutions turned to expert frameworks like Fabre et al. (2020), which stratify bacteremia risk by clinical presentation, though these frameworks have not been evaluated at scale. In our pilot, unguided LLM queries produced high sensitivity but poor specificity, consistent with prior literature, suggesting a tendency to overestimate infection risk. To address this, we anchored both clinician and LLM classification in the Fabre framework to improve precision and enable scalable clinical decision support.

Figure 1: Large Language Model (LLM)-Based Pipeline for Automated Risk Stratification of Bacteremia. Schematic diagram illustrating the structured pipeline leveraging a…

Linked entities

Diseases (5)

bacteremia, cellulitis, pyelonephritis, cholangitis, meningitis

Figures (3)

Taxonomy

Topics: Bacterial Identification and Susceptibility Testing · Sepsis Diagnosis and Treatment · Clinical Reasoning and Diagnostic Skills