P-1965. A Smart Way of Chart Review for Research: Utilizing Large Language Models (LLM) for Extraction of Unstructured Text Data
Ali Ejaz, Jeffrey Shu, Jarrod Dalton, Abhishek Deshpande, Ken Koon Wong

TL;DR
This paper shows how large language models can quickly and accurately extract data from unstructured stool culture reports, saving time and costs in clinical research.
Contribution
The paper demonstrates a novel use of few-shot prompt engineering with LLMs: extracting structured data from stool culture reports.
Findings
Iterative prompt optimization improved the LLM's extraction accuracy from 89.21% to 99.34%.
Using LLMs saved an estimated 12.1 hours of manual work and reduced costs to $2.14.
The approach processed 65,703 stool culture results with high efficiency.
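To put the reported accuracy gain in perspective, the two percentages above can be restated as error rates. The short sketch below only rearranges the numbers quoted in the findings; the function name `error_summary` is an illustrative assumption, not something from the paper.

```python
def error_summary(baseline_accuracy: float, final_accuracy: float) -> dict:
    """Restate two accuracy percentages as error rates and improvements."""
    baseline_error = 100 - baseline_accuracy
    final_error = 100 - final_accuracy
    return {
        # Absolute gain in accuracy, in percentage points.
        "gain_points": round(final_accuracy - baseline_accuracy, 2),
        "baseline_error": round(baseline_error, 2),
        "final_error": round(final_error, 2),
        # Share of the original errors removed by prompt optimization.
        "relative_error_reduction": round(
            (baseline_error - final_error) / baseline_error, 3
        ),
    }


# Accuracy before and after prompt optimization, as reported in the findings.
summary = error_summary(89.21, 99.34)
print(summary)
```

Read this way, the error rate falls from 10.79% to 0.66%, a gain of 10.13 percentage points that removes roughly 94% of the original errors.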
Abstract
Large Language Models (LLMs) have potential to improve clinical research, particularly in parsing unstructured data. Stool culture reports, typically recorded as unstructured free text, pose challenges for systematically identifying positive and negative pathogens and may require hours of manual chart review to extract relevant data fields. This limits the utilization of microbiology reports for quality improvement projects, clinical research, and public health reporting. This study evaluates the use of LLMs to transform free-text stool culture reports into structured, tabular data for large-scale analyses. We conducted a retrospective cohort study of stool culture reports (2010–2024) from 12 acute-care hospitals…
Figure 1. Prompt Engineering (Few Shot Prompt)
Figure 2. Performance metrics for all 3 iterations
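The few-shot prompting idea named in Figure 1 can be illustrated with a minimal sketch. The study's actual prompt, model, and output schema are not shown here, so everything below is an assumption made for illustration: the example reports, the `positive`/`negative` field names, and the helper functions are hypothetical, and the call to the LLM itself is deliberately left out.

```python
import json

# Hypothetical labelled examples; the study's real prompt and reports
# are not reproduced in the source.
FEW_SHOT_EXAMPLES = [
    (
        "No Salmonella, Shigella, or Campylobacter isolated.",
        {"positive": [], "negative": ["Salmonella", "Shigella", "Campylobacter"]},
    ),
    (
        "Campylobacter jejuni isolated. No Salmonella or Shigella isolated.",
        {"positive": ["Campylobacter jejuni"], "negative": ["Salmonella", "Shigella"]},
    ),
]

INSTRUCTION = (
    "Extract the pathogens reported as positive and negative in the stool "
    "culture report. Answer with JSON using the keys 'positive' and 'negative'."
)


def build_few_shot_prompt(report: str) -> str:
    """Assemble instruction, worked examples, and the new report into one prompt."""
    parts = [INSTRUCTION, ""]
    for text, label in FEW_SHOT_EXAMPLES:
        parts.append(f"Report: {text}")
        parts.append(f"Answer: {json.dumps(label)}")
        parts.append("")
    parts.append(f"Report: {report}")
    parts.append("Answer:")
    return "\n".join(parts)


def parse_answer(raw: str) -> dict:
    """Turn the model's JSON reply into one structured row; reject other shapes."""
    data = json.loads(raw)
    if set(data) != {"positive", "negative"}:
        raise ValueError(f"unexpected keys in model reply: {sorted(data)}")
    return {"positive": sorted(data["positive"]), "negative": sorted(data["negative"])}


prompt = build_few_shot_prompt("Shigella sonnei isolated. No Salmonella isolated.")
# A reply like this would come back from the LLM; here it is hard-coded.
row = parse_answer('{"positive": ["Shigella sonnei"], "negative": ["Salmonella"]}')
```

Validating the reply before it enters the table is one way to catch malformed outputs early. Iterating on the instruction and examples against a manually reviewed sample is, in spirit, the kind of prompt optimization the findings describe.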
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Bacterial Identification and Susceptibility Testing · Machine Learning in Healthcare
