Tracking Cancer Through Text: Longitudinal Extraction From Radiology Reports Using Open-Source Large Language Models
Luc Builtjes, Alessa Hering

TL;DR
This paper introduces an open-source, locally deployable pipeline that uses large language models to extract and link longitudinal tumor data from radiology reports. On 50 Dutch CT report pairs it reaches attribute-level accuracies above 93% while keeping patient data on local infrastructure.
Contribution
A fully open-source system that applies LLMs to longitudinal oncology data extraction from radiology reports and runs locally, which supports both privacy and reproducibility.
Findings
High attribute-level accuracy for lesion extraction (over 93%)
Effective longitudinal lesion linking across time points
Open-source LLMs can perform clinically meaningful tasks
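The summary does not spell out how attribute-level accuracy is computed. One plausible reading is the fraction of annotated attribute values that the extracted record reproduces exactly. The sketch below illustrates that reading only; the field names (`location`, `size_mm`, `category`) and the assumption that predicted and reference lesions are already aligned one-to-one are hypothetical, not taken from the paper.

```python
from typing import Any


def attribute_accuracy(
    predicted: list[dict[str, Any]],
    reference: list[dict[str, Any]],
) -> float:
    """Fraction of reference attribute values matched exactly by the prediction.

    Assumes predicted[i] and reference[i] describe the same lesion.
    """
    correct = 0
    total = 0
    for pred, ref in zip(predicted, reference):
        for key, ref_value in ref.items():
            total += 1
            if pred.get(key) == ref_value:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical example: two target lesions, one size extracted incorrectly.
reference = [
    {"location": "right upper lobe", "size_mm": 14, "category": "target"},
    {"location": "liver segment 7", "size_mm": 22, "category": "target"},
]
predicted = [
    {"location": "right upper lobe", "size_mm": 14, "category": "target"},
    {"location": "liver segment 7", "size_mm": 20, "category": "target"},
]

score = attribute_accuracy(predicted, reference)
print(f"attribute-level accuracy: {score:.3f}")
```

Here five of the six attribute values match, so a single wrong measurement lowers the score without discarding the lesion's other, correct attributes.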
Abstract
Radiology reports capture crucial longitudinal information on tumor burden, treatment response, and disease progression, yet their unstructured narrative format complicates automated analysis. While large language models (LLMs) have advanced clinical text processing, most state-of-the-art systems remain proprietary, limiting their applicability in privacy-sensitive healthcare environments. We present a fully open-source, locally deployable pipeline for longitudinal information extraction from radiology reports, implemented using the llm_extractinator framework. The system applies the qwen2.5-72b model to extract and link target, non-target, and new lesion data across time points in accordance with RECIST criteria. Evaluation on 50 Dutch CT Thorax/Abdomen report pairs yielded high extraction performance, with attribute-level accuracies of 93.7% for target lesions, 94.9% for non-target…
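To make the idea of linking lesions across time points concrete, the following is a minimal sketch of a structured lesion record and a baseline-to-follow-up link. It is not the llm_extractinator schema or the authors' method: the `Lesion` class, its fields, and the assumption that linked lesions share an identifier are all hypothetical and used for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lesion:
    lesion_id: str                   # hypothetical identifier shared across reports
    category: str                    # "target", "non-target", or "new"
    location: str
    size_mm: Optional[float] = None  # a measurement may be absent, e.g. for non-target lesions


def link_lesions(baseline: list[Lesion], follow_up: list[Lesion]):
    """Pair each follow-up lesion with its baseline record, or None if it has none."""
    by_id = {lesion.lesion_id: lesion for lesion in baseline}
    return [(by_id.get(lesion.lesion_id), lesion) for lesion in follow_up]


def target_sum_mm(lesions: list[Lesion]) -> float:
    """Sum of the recorded diameters of target lesions."""
    return sum(
        lesion.size_mm
        for lesion in lesions
        if lesion.category == "target" and lesion.size_mm is not None
    )


# Hypothetical report pair.
baseline = [
    Lesion("T1", "target", "right upper lobe", 20.0),
    Lesion("NT1", "non-target", "mediastinal lymph node"),
]
follow_up = [
    Lesion("T1", "target", "right upper lobe", 15.0),
    Lesion("NT1", "non-target", "mediastinal lymph node"),
    Lesion("N1", "new", "liver segment 7", 8.0),
]

pairs = link_lesions(baseline, follow_up)
change_mm = target_sum_mm(follow_up) - target_sum_mm(baseline)
```

A follow-up lesion with no baseline partner (here `N1`) surfaces as a pair whose first element is `None`, which matches how a new lesion would appear in a linked record.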
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Biomedical Text Mining and Ontologies
