Automated Extraction of Unstructured Post-SBRT Toxicity Data from Radiology Reports Using Large Language Models
Justin Pijanowski, Yakout Mezgueldi, Alan Lee, Drew Moghanaki, Ricky R. Savjani, James Lamb

TL;DR
This study demonstrates that large language models can accurately extract specific toxicity and progression outcomes from unstructured radiology reports in lung SBRT patients, streamlining clinical data extraction.
Contribution
We developed prompt-engineered LLM methods to reliably extract toxicity and progression data from unstructured radiology reports, showing high accuracy and viability for clinical applications.
Findings
High sensitivity and specificity in toxicity detection
Effective classification of progression status
Viability of LLMs for clinical report extraction
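For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how sensitivity and specificity could be computed for a binary endpoint such as radiation-induced fibrosis. The function name and the toy labels are illustrative assumptions, not the authors' code or data.

```python
def sensitivity_specificity(truth, predicted, positive="yes"):
    """Return (sensitivity, specificity) for paired label lists.

    `truth` holds reference labels and `predicted` holds model labels.
    Any label other than `positive` is treated as negative (an assumption
    made for this sketch).
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)

    # Guard against division by zero when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy, made-up labels purely to exercise the function.
toy_truth = ["yes", "yes", "no", "no", "no"]
toy_pred = ["yes", "no", "no", "no", "yes"]
sens, spec = sensitivity_specificity(toy_truth, toy_pred)
```

On the toy lists there is one true positive, one false negative, two true negatives, and one false positive, so sensitivity is 1/2 and specificity is 2/3.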
Abstract
We evaluated the viability of using a Large Language Model (LLM) to extract patient-specific toxicity and progression outcomes from unstructured radiology reports. We retrospectively extracted 160 follow-up CT and PET/CT electronic medical record notes for patients treated with lung stereotactic body radiotherapy (SBRT) at our institution from January 2017 through December 2023. Using the Llama-3.3-70B-Instruct LLM, we engineered prompts to extract four clinical endpoints from each radiology report: locoregional progression, distant progression, radiation-induced fibrosis, and radiation-induced rib fractures. Progression endpoints were classified as yes, no, or maybe, while fibrosis and rib fractures were binary (yes or no). Ground truth labels were defined using two-grader consensus for the 60-note training set, used for prompt development, and a three-grader majority vote…
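The label scheme described in the abstract (three-way labels for progression, binary labels for fibrosis and rib fractures) and the idea of a grader majority vote can be sketched as follows. This is only an illustration under stated assumptions: the endpoint keys, the function names, and the choice to return None when no label wins a strict majority are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Allowed labels per endpoint, following the abstract's description.
# The dictionary keys are hypothetical identifiers chosen for this sketch.
ENDPOINT_LABELS = {
    "locoregional_progression": {"yes", "no", "maybe"},
    "distant_progression": {"yes", "no", "maybe"},
    "radiation_induced_fibrosis": {"yes", "no"},
    "radiation_induced_rib_fracture": {"yes", "no"},
}


def normalize_label(endpoint, raw):
    """Clean a raw model answer and check it against the endpoint's label set."""
    label = raw.strip().lower().rstrip(".")
    if label not in ENDPOINT_LABELS[endpoint]:
        raise ValueError(f"unexpected label {raw!r} for endpoint {endpoint!r}")
    return label


def majority_vote(labels):
    """Return the label chosen by more than half of the graders, else None.

    Returning None on a tie is an assumption of this sketch; with three
    graders and three possible labels, all three may disagree.
    """
    if not labels:
        raise ValueError("at least one grader label is required")
    [(label, count)] = Counter(labels).most_common(1)
    return label if count > len(labels) / 2 else None


# Toy usage with made-up answers.
cleaned = normalize_label("distant_progression", " Maybe.\n")
consensus = majority_vote(["yes", "no", "yes"])
no_consensus = majority_vote(["yes", "no", "maybe"])
```

Validating each answer against a fixed label set makes it easy to flag free-text responses that fall outside the expected vocabulary before they are compared with reference labels.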
Taxonomy
Topics: Topic Modeling · Radiomics and Machine Learning in Medical Imaging · Artificial Intelligence in Healthcare and Education
