Development and Validation of a Large Language Model for Generating Fully-Structured Radiology Reports
Chuang Niu, Md Sayed Tanveer, Md Zabirul Islam, Parisa Kaviani, Qing Lyu, Mannudeep K. Kalra, Christopher T. Whitlow, Ge Wang

TL;DR
This study presents an open-source large language model capable of generating fully-structured radiology reports with high accuracy, addressing previous issues like formatting errors and hallucinations, and demonstrating utility in statistical analysis and nodule retrieval.
Contribution
We developed a dynamic-template-constrained decoding method for LLMs, enabling accurate, fully-structured radiology reports from free-text data across institutions, with improved performance over existing models.
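The summary does not spell out how the decoding method works, so the following is only a toy sketch of the general idea of template-constrained decoding, not the authors' implementation. It assumes a hypothetical two-field template (`NODULE_FIELDS`) and replaces the LLM with a stand-in scorer (`toy_score`) based on string matching. The point it illustrates is that the decoder can only emit values the template allows, and that the template grows with the number of nodules described.

```python
from typing import Callable, Dict, List

# Hypothetical per-nodule fields; each lists the only values the decoder may emit.
NODULE_FIELDS: Dict[str, List[str]] = {
    "attenuation": ["solid", "part-solid", "ground-glass"],
    "lobe": ["right upper lobe", "right middle lobe", "right lower lobe",
             "left upper lobe", "left lower lobe"],
}

Scorer = Callable[[str, str, str], float]


def toy_score(text: str, field: str, value: str) -> float:
    """Stand-in for a model score: longer literal matches score higher."""
    return float(len(value)) if value in text.lower() else 0.0


def constrained_decode(nodule_texts: List[str],
                       score: Scorer = toy_score) -> List[Dict[str, str]]:
    """Fill one template block per nodule, choosing only from allowed values."""
    report = []
    for text in nodule_texts:  # the template grows with the nodule count
        block = {}
        for field, allowed in NODULE_FIELDS.items():
            best = max(allowed, key=lambda v: score(text, field, v))
            # Fall back to a fixed placeholder when nothing in the text matches.
            block[field] = best if score(text, field, best) > 0 else "not stated"
        report.append(block)
    return report


structured = constrained_decode([
    "6 mm part-solid nodule in the right upper lobe",
    "Ground-glass nodule, left lower lobe",
])
```

Because every value comes from a closed list, the output cannot contain a malformed field or a value outside the template, which is the kind of formatting guarantee the contribution statement describes. In a real system the scorer would be the language model's token probabilities rather than substring matching.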
Findings
Achieved about 97% F1 score on cross-institutional datasets.
Outperformed GPT-4o by 17.19% in report accuracy.
Enabled effective statistical analysis and nodule retrieval.
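To make the findings concrete, here is a minimal sketch of how an F1 score over structured fields and a nodule lookup could be computed. The report layout (one field-to-value dictionary per nodule), the micro-averaged F1 over exact field matches, and the helper names `micro_f1` and `find_nodules` are all assumptions for illustration; the summary does not state how the paper computes its F1 or implements retrieval.

```python
from typing import Dict, List, Set, Tuple

Report = List[Dict[str, str]]  # one dict of field -> value per nodule


def to_pairs(report: Report) -> Set[Tuple[int, str, str]]:
    """Flatten a report into (nodule index, field, value) triples."""
    return {(i, f, v) for i, nodule in enumerate(report) for f, v in nodule.items()}


def micro_f1(pred: Report, ref: Report) -> float:
    """Micro-averaged F1 over exact field matches (assumed metric definition)."""
    p, r = to_pairs(pred), to_pairs(ref)
    tp = len(p & r)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(p), tp / len(r)
    return 2 * precision * recall / (precision + recall)


def find_nodules(reports: Dict[str, Report], **criteria: str) -> List[Tuple[str, int]]:
    """Return (report id, nodule index) for nodules matching every criterion."""
    return [(rid, i)
            for rid, rep in reports.items()
            for i, nodule in enumerate(rep)
            if all(nodule.get(k) == v for k, v in criteria.items())]


reference = [{"attenuation": "part-solid", "lobe": "right upper lobe"}]
predicted = [{"attenuation": "solid", "lobe": "right upper lobe"}]

score = micro_f1(predicted, reference)  # one of two fields is correct
hits = find_nodules({"A": reference, "B": predicted}, attenuation="solid")
```

With one of two fields correct, precision and recall are both 0.5, so `score` is 0.5. The retrieval helper shows why full structure matters: once every feature is a named field with a controlled value, finding individual nodules or tallying them becomes a plain filter rather than a text search.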
Abstract
Current LLMs for creating fully-structured reports face formatting errors, content hallucinations, and privacy leakage when data are uploaded to external servers. We aim to develop an open-source, accurate LLM for creating fully-structured and standardized lung cancer screening (LCS) reports from varying free-text reports across institutions and to demonstrate its utility in automatic statistical analysis and individual lung nodule retrieval. With IRB approvals, our retrospective study included 5,442 de-identified low-dose computed tomography (LDCT) LCS radiology reports from two institutions. We constructed two evaluation datasets by labeling 500 pairs of free-text and fully-structured radiology reports and one large-scale consecutive dataset from January 2021 to December 2023. Two radiologists created a standardized template for recording 27 lung nodule features on LCS. We designed a dynamic-template-constrained decoding…
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Lung Cancer Diagnosis and Treatment · Topic Modeling
