Fact-Aware Multimodal Retrieval Augmentation for Accurate Medical Radiology Report Generation
Liwen Sun, James Zhao, Megan Han, Chenyan Xiong

TL;DR
This paper presents FactMM-RAG, a fact-aware retrieval-augmented approach that improves the factual accuracy of radiology report generation by integrating high-quality reference reports using a multimodal retriever trained with factual knowledge.
Contribution
It introduces a novel fact-aware multimodal retrieval pipeline leveraging RadGraph to enhance radiology report generation accuracy without relying on explicit diagnostic labels.
Findings
Outperforms state-of-the-art retrievers on two benchmark datasets.
Achieves up to 6.5% improvement in F1-CheXbert scores.
Effectively propagates fact-aware capabilities to report generation.
Abstract
Multimodal foundation models hold significant potential for automating radiology report generation, thereby assisting clinicians in diagnosing cardiac diseases. However, generated reports often suffer from serious factual inaccuracies. In this paper, we introduce FactMM-RAG, a fact-aware multimodal retrieval-augmented pipeline for generating accurate radiology reports. We first leverage RadGraph to mine factual report pairs, then integrate this factual knowledge to train a universal multimodal retriever. Given a radiology image, our retriever can identify high-quality reference reports to augment multimodal foundation models, thus enhancing the factual completeness and correctness of report generation. Experiments on two benchmark datasets show that our multimodal retriever outperforms state-of-the-art retrievers on both language generation and radiology-specific metrics, up to 6.5% and 2%…
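As a rough illustration of the pair-mining step, the sketch below treats each report as a set of (entity, label) annotations and pairs reports whose annotation sets overlap strongly. This is not the authors' implementation. The set representation, the F1-style overlap score, the 0.5 threshold, and the names `entity_f1` and `mine_factual_pairs` are all assumptions made for the example, and the annotations are taken as already extracted (for instance by RadGraph) rather than computed here.

```python
from itertools import combinations


def entity_f1(a: set, b: set) -> float:
    """F1-style overlap between two annotation sets (0.0 if they share nothing)."""
    overlap = len(a & b)
    if overlap == 0:
        return 0.0
    precision = overlap / len(a)
    recall = overlap / len(b)
    return 2 * precision * recall / (precision + recall)


def mine_factual_pairs(report_entities: dict, threshold: float = 0.5) -> list:
    """Return (id_a, id_b, score) for report pairs whose overlap meets the threshold.

    `report_entities` maps a report id to a set of (entity, label) tuples.
    The threshold is a hypothetical value chosen only for illustration.
    """
    pairs = []
    # Compare every unordered pair of reports once, in a stable id order.
    for id_a, id_b in combinations(sorted(report_entities), 2):
        score = entity_f1(report_entities[id_a], report_entities[id_b])
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return pairs
```

Under these assumptions, two reports that share two annotations out of two and three respectively score 0.8 and are paired, while a report that mentions the same entity with a different label (present versus absent) contributes no overlap. Pairs mined this way could then serve as positives when training a retriever, which is the role the abstract describes for factual report pairs.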
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies
