Prototype-Based Knowledge Guidance for Fine-Grained Structured Radiology Reporting
Chantal Pellegrini, Adrian Delchev, Ege Özsoy, Nassir Navab, Matthias Keicher

TL;DR
ProtoSR leverages free-text radiology reports to enhance structured reporting accuracy by integrating a large knowledge base of visual prototypes, significantly improving fine-grained decision-making in radiology automation.
Contribution
This paper introduces ProtoSR, a novel method that uses a large-scale, automatically extracted knowledge base of visual prototypes from free-text reports to improve structured radiology report generation.
Findings
Achieves state-of-the-art results on the Rad-ReStruct benchmark.
Significantly improves accuracy on detailed attribute questions.
Demonstrates the effectiveness of free-text knowledge integration.
Abstract
Structured radiology reporting promises faster, more consistent communication than free text, but automation remains difficult as models must make many fine-grained, discrete decisions about rare findings and attributes from limited structured supervision. In contrast, free-text reports are produced at scale in routine care and implicitly encode fine-grained, image-linked information through detailed descriptions. To leverage this unstructured knowledge, we propose ProtoSR, an approach for injecting free-text information into structured report population. First, we introduce an automatic extraction pipeline that uses an instruction-tuned LLM to mine 80k+ MIMIC-CXR studies and build a multimodal knowledge base aligned with a structured reporting template, representing each answer option with a visual prototype. Using this knowledge base, ProtoSR is trained to retrieve prototypes relevant…
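The prototype idea described in the abstract can be illustrated with a small sketch. The abstract is truncated and does not say how prototypes are computed or how retrieval is scored, so everything below is an assumption made for illustration: each answer option's prototype is taken as the mean of the image embeddings linked to it, and retrieval ranks prototypes by cosine similarity to a query embedding. The function names (`build_prototypes`, `retrieve`), the option labels, and the toy 3-dimensional embeddings are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_prototypes(samples):
    """Group (answer_option, image_embedding) pairs and average each group.

    Assumption: a prototype is the mean embedding of its answer option.
    """
    grouped = defaultdict(list)
    for option, embedding in samples:
        grouped[option].append(embedding)
    return {option: mean_vector(embs) for option, embs in grouped.items()}


def retrieve(query, prototypes, k=2):
    """Return the k answer options whose prototypes are most similar to query.

    Assumption: similarity is cosine similarity in the embedding space.
    """
    scored = [(cosine(query, proto), option) for option, proto in prototypes.items()]
    scored.sort(reverse=True)
    return [option for _, option in scored[:k]]


# Toy data standing in for embeddings of images whose free-text reports
# were mapped to answer options of a structured template (hypothetical labels).
samples = [
    ("opacity: left lung", [1.0, 0.0, 0.0]),
    ("opacity: left lung", [0.8, 0.2, 0.0]),
    ("opacity: right lung", [0.0, 1.0, 0.0]),
    ("effusion: small", [0.0, 0.0, 1.0]),
]
prototypes = build_prototypes(samples)
top = retrieve([0.9, 0.1, 0.0], prototypes, k=2)
print(top)
```

In a real system the embeddings would come from an image encoder and the option labels from the structured reporting template. The sketch only shows the data flow from labeled embeddings to prototypes to a ranked retrieval.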
Taxonomy
Topics: Radiology practices and education · Multimodal Machine Learning Applications · Topic Modeling
