Fact-Aware Multimodal Retrieval Augmentation for Accurate Medical Radiology Report Generation

Liwen Sun; James Zhao; Megan Han; Chenyan Xiong

arXiv:2407.15268 · cs.CL · February 7, 2025 · 2 cites

TL;DR

This paper presents FactMM-RAG, a fact-aware retrieval-augmented approach that improves the factual accuracy of radiology report generation by integrating high-quality reference reports using a multimodal retriever trained with factual knowledge.

Contribution

It introduces a novel fact-aware multimodal retrieval pipeline leveraging RadGraph to enhance radiology report generation accuracy without relying on explicit diagnostic labels.

Findings

01. Outperforms state-of-the-art retrievers on two benchmark datasets, on both language generation and radiology-specific metrics.

02. Achieves up to 6.5% improvement in F1CheXbert scores.

03. Effectively propagates fact-aware capabilities to report generation.

Abstract

Multimodal foundation models hold significant potential for automating radiology report generation, thereby assisting clinicians in diagnosing cardiac diseases. However, generated reports often suffer from serious factual inaccuracy. In this paper, we introduce a fact-aware multimodal retrieval-augmented pipeline for generating accurate radiology reports (FactMM-RAG). We first leverage RadGraph to mine factual report pairs, then integrate factual knowledge to train a universal multimodal retriever. Given a radiology image, our retriever can identify high-quality reference reports to augment multimodal foundation models, thus enhancing the factual completeness and correctness of report generation. Experiments on two benchmark datasets show that our multimodal retriever outperforms state-of-the-art retrievers on both language generation and radiology-specific metrics, up to 6.5% and 2%…
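
The two steps the abstract names, mining factual report pairs and retrieving reference reports for a query image, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the fact sets stand in for RadGraph output (RadGraph itself is not called), the F1 overlap score and the 0.5 threshold are hypothetical pairing criteria, and the toy vectors stand in for embeddings from a trained retriever. The names `fact_f1`, `mine_factual_pairs`, and `retrieve` are not taken from the paper, and the sketch does not train anything.

```python
from itertools import combinations
from math import sqrt


def fact_f1(facts_a, facts_b):
    """F1 overlap between two sets of extracted facts (hypothetical criterion)."""
    shared = len(facts_a & facts_b)
    if shared == 0:
        return 0.0
    precision = shared / len(facts_b)
    recall = shared / len(facts_a)
    return 2 * precision * recall / (precision + recall)


def mine_factual_pairs(report_facts, threshold=0.5):
    """Return (id_a, id_b) pairs whose fact overlap meets the assumed threshold."""
    pairs = []
    for id_a, id_b in combinations(sorted(report_facts), 2):
        if fact_f1(report_facts[id_a], report_facts[id_b]) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def retrieve(query_vec, corpus_vecs, k=1):
    """Return the ids of the k reports most similar to the query embedding."""
    ranked = sorted(
        corpus_vecs,
        key=lambda rid: cosine(query_vec, corpus_vecs[rid]),
        reverse=True,
    )
    return ranked[:k]


# Toy fact sets standing in for RadGraph output (made-up data).
report_facts = {
    "r1": {("effusion", "pleural"), ("cardiomegaly", "present")},
    "r2": {("effusion", "pleural"), ("cardiomegaly", "present"), ("edema", "present")},
    "r3": {("pneumothorax", "absent")},
}
pairs = mine_factual_pairs(report_facts)

# Toy embeddings standing in for a trained multimodal retriever (made-up data).
report_vecs = {"r1": [0.9, 0.1], "r3": [0.0, 1.0]}
top = retrieve([1.0, 0.0], report_vecs, k=1)
```

In a pipeline of the kind the abstract describes, pairs like these would serve as supervision for training the retriever, and the retrieved reports would be passed to the foundation model together with the image. How the paper scores pairs, sets thresholds, and trains the retriever is not specified in this excerpt.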

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies