RadioRAG: Online Retrieval-augmented Generation for Radiology Question Answering

Soroosh Tayebi Arasteh; Mahshad Lotfinia; Keno Bressem; Robert Siepmann; Lisa Adams; Dyke Ferber; Christiane Kuhl; Jakob Nikolas Kather; Sven Nebelung; Daniel Truhn

arXiv:2407.15621 · cs.CL · June 19, 2025

TL;DR

RadioRAG enhances radiology question answering by integrating real-time data retrieval from authoritative sources, significantly improving the diagnostic accuracy of various large language models across multiple radiologic subspecialties.

Contribution

We developed RadioRAG, an end-to-end framework that retrieves real-time radiology data, improving LLM diagnostic accuracy and surpassing previous fixed-database RAG systems.

Findings

01 RadioRAG increased LLM accuracy by up to 54%.

02 It matched or exceeded human radiologists in accuracy.

03 Effectiveness varied among different LLMs.

Abstract

Large language models (LLMs) often generate outdated or inaccurate information based on static training datasets. Retrieval-augmented generation (RAG) mitigates this by integrating outside data sources. While previous RAG systems used pre-assembled, fixed databases with limited flexibility, we have developed Radiology RAG (RadioRAG), an end-to-end framework that retrieves data from authoritative radiologic online sources in real-time. We evaluate the diagnostic accuracy of various LLMs when answering radiology-specific questions with and without access to additional online information via RAG. Using 80 questions from the RSNA Case Collection across radiologic subspecialties and 24 additional expert-curated questions with reference standard answers, LLMs (GPT-3.5-turbo, GPT-4, Mistral-7B, Mixtral-8x7B, and Llama3 [8B and 70B]) were prompted with and without RadioRAG in a zero-shot…
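
The with/without comparison described in the abstract can be sketched as a small evaluation harness. This is only a minimal illustration, not the RadioRAG implementation: the function names, the prompt layout, the exact-match scoring, and the toy retriever, model, questions, and answers are all hypothetical placeholders, and the numbers it produces say nothing about the paper's results.

```python
from typing import Callable, Sequence


def build_prompt(question: str, context: Sequence[str] = ()) -> str:
    """Build a zero-shot prompt; retrieved snippets (if any) are prepended."""
    parts = []
    if context:
        parts.append("Context:\n" + "\n".join(f"- {c}" for c in context))
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)


def accuracy(questions: Sequence[str], references: Sequence[str],
             answer_fn: Callable[[str], str]) -> float:
    """Fraction of questions answered correctly (exact match is an assumption)."""
    correct = sum(
        answer_fn(q).strip().lower() == r.strip().lower()
        for q, r in zip(questions, references)
    )
    return correct / len(questions)


def relative_gain(baseline: float, augmented: float) -> float:
    """Relative accuracy change of the augmented run over the baseline run."""
    return (augmented - baseline) / baseline


# --- Hypothetical stand-ins for an online retriever and an LLM ---------------
TOY_CORPUS = {"q1": ["answer: alpha"], "q2": ["answer: beta"]}
TOY_MEMORY = {"q3": "gamma"}  # what the toy "model" knows without retrieval


def toy_retriever(question: str) -> list:
    """Return placeholder snippets for a question (empty if nothing is found)."""
    return TOY_CORPUS.get(question, [])


def toy_model(prompt: str) -> str:
    """Copy an answer from the context if present, else fall back to memory."""
    for line in prompt.splitlines():
        if line.startswith("- answer: "):
            return line[len("- answer: "):]
    question = prompt.split("Question: ")[1].split("\n")[0]
    return TOY_MEMORY.get(question, "unknown")


questions = ["q1", "q2", "q3", "q4"]
references = ["alpha", "beta", "gamma", "delta"]

# Same model, same questions; only the presence of retrieved context differs.
acc_plain = accuracy(questions, references,
                     lambda q: toy_model(build_prompt(q)))
acc_rag = accuracy(questions, references,
                   lambda q: toy_model(build_prompt(q, toy_retriever(q))))
gain = relative_gain(acc_plain, acc_rag)
```

Keeping the model and question set fixed and varying only the retrieved context is what makes the two accuracy figures comparable; a real evaluation would replace the toy retriever and model with actual services and would likely need a more tolerant scoring rule than exact string match.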

Code & Models

Repositories

tayebiarasteh/radiorag (official)

Taxonomy

Methods: Attention Is All You Need · Linear Warmup With Linear Decay · WordPiece · Cosine Annealing · Label Smoothing · BERT · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Linear Warmup With Cosine Annealing