P-1967. Using Secure Artificial Intelligence Agents Integrated within the Electronic Medical Record for the Evaluation of Blood Culture Appropriateness — Northern California, 2025

Guillermo Rodriguez-Nava; Timothy Keyes; Nerissa Ambers; Eugenia Miranti; Wajeeha Tariq; Erika P Viana-Cardenas; Mindy M Sampson; Jorge Salinas

PMC · DOI:10.1093/ofid/ofaf695.2134 · January 11, 2026

Open Access

TL;DR

This study evaluated whether AI agents could judge, from real-world medical records, whether blood cultures were appropriately ordered. The agents showed limited accuracy (balanced accuracy 0.568) and tended to over-flag orders as appropriate.

Contribution

The study introduces a novel method of using secure AI agents integrated into electronic medical records to audit blood culture appropriateness in real clinical data.

Findings

01

AI agents had a balanced accuracy of 0.568 and frequently over-flagged blood culture orders as appropriate.

02

The agents showed a tendency toward sycophantic bias, aligning with clinical notes rather than strict criteria.

03

The 'severe sepsis/septic shock' criterion was most commonly used by AI agents to justify appropriateness.
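To make the balanced-accuracy figure in finding 01 concrete, here is a minimal sketch of how the metric is computed from a confusion matrix. It assumes "appropriate" is treated as the positive class, and the counts are hypothetical placeholders chosen only for illustration; they are not the study's data.

```python
def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity and specificity for a binary classifier.

    Assumption: 'appropriate' is the positive class, so a false positive
    is an order the agent called appropriate that the reference standard
    called inappropriate.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes need at least one reference case")
    sensitivity = tp / (tp + fn)  # share of truly appropriate orders caught
    specificity = tn / (tn + fp)  # share of truly inappropriate orders caught
    return (sensitivity + specificity) / 2


# Hypothetical counts (NOT from the study): an agent that over-flags
# orders as appropriate has high sensitivity but low specificity.
example = balanced_accuracy(tp=90, fn=10, tn=20, fp=80)
print(round(example, 3))
```

With these made-up counts, sensitivity is 0.90 and specificity is 0.20, so the balanced accuracy sits close to 0.5, the value expected from a classifier that carries no information. This is the pattern over-flagging produces: a value only slightly above 0.5, such as the reported 0.568, indicates limited discrimination even when most appropriate orders are correctly identified.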

Abstract

Large language models (LLMs) have gained attention for their ability to exhibit human-like clinical reasoning with mock clinical cases. However, because of privacy concerns, few studies have evaluated their use in real-world healthcare settings. We aimed to assess the accuracy of LLMs in auditing blood culture appropriateness using real charts.

Prompt Provided to Initial Reviewer AI Agent for Blood Culture Appropriateness Classification

AI agents were guided by structured inclusion and exclusion criteria to assess blood culture appropriateness. Prompts included clinical definitions, required supporting evidence, and explicit instructions to avoid assumptions or external reasoning beyond the documentation in the clinical note. Agents were also asked to provide quoted justification for their classifications. The criteria were adapted from the Johns Hopkins Prevention Epicenter Blood Culture…
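Because the agents were asked to quote the note when justifying a classification, one simple safeguard an auditor might add is a check that each quoted justification really appears in the source note. The sketch below is an assumption about how such a check could look, not the authors' pipeline; the function names and the sample note are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so line wraps do not break matching."""
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_is_grounded(quote: str, note: str) -> bool:
    """Return True only if the non-empty quote occurs verbatim in the note
    after whitespace and case normalization."""
    normalized_quote = normalize(quote)
    return bool(normalized_quote) and normalized_quote in normalize(note)


# Hypothetical clinical note text, invented for illustration only.
note = "Patient febrile to 39.2 C with hypotension;\nconcern for septic shock."

print(quote_is_grounded("concern for  septic shock", note))   # quote is present
print(quote_is_grounded("neutropenic fever documented", note))  # quote is absent
```

A check like this only confirms that the quote exists in the note. It does not establish that the quoted text satisfies the appropriateness criteria, so it would not by itself catch an agent that defers to the clinician's documented impression.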

Figures: 4

Taxonomy

Topics: Bacterial Identification and Susceptibility Testing · Artificial Intelligence in Healthcare and Education · Sepsis Diagnosis and Treatment