TL;DR
RadioRAG enhances radiology question answering by integrating real-time data retrieval from authoritative sources, significantly improving the diagnostic accuracy of various large language models across multiple radiologic subspecialties.
Contribution
We developed RadioRAG, an end-to-end framework that retrieves real-time radiology data, improving LLM diagnostic accuracy and surpassing previous fixed-database RAG systems.
Findings
RadioRAG increased the diagnostic accuracy of LLMs by up to 54% in relative terms.
It matched or exceeded human radiologists in accuracy.
Effectiveness varied among different LLMs.
Abstract
Large language models (LLMs) often generate outdated or inaccurate information based on static training datasets. Retrieval-augmented generation (RAG) mitigates this by integrating outside data sources. While previous RAG systems used pre-assembled, fixed databases with limited flexibility, we have developed Radiology RAG (RadioRAG), an end-to-end framework that retrieves data from authoritative radiologic online sources in real-time. We evaluate the diagnostic accuracy of various LLMs when answering radiology-specific questions with and without access to additional online information via RAG. Using 80 questions from the RSNA Case Collection across radiologic subspecialties and 24 additional expert-curated questions with reference standard answers, LLMs (GPT-3.5-turbo, GPT-4, Mistral-7B, Mixtral-8x7B, and Llama3 [8B and 70B]) were prompted with and without RadioRAG in a zero-shot…
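The with/without-RAG comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the helper names (`build_prompt`, `accuracy`, `relative_increase`), the prompt wording, the exact-match scoring rule, and the toy answers are all assumptions made for the example, and the retrieval step itself is left out.

```python
from typing import Optional, Sequence


def build_prompt(question: str, context: Optional[str] = None) -> str:
    """Build a zero-shot prompt; retrieved context is prepended only if given.

    The prompt template is hypothetical, not the one used in the paper.
    """
    if context:
        return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return f"Question: {question}\nAnswer:"


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions equal to the reference answer.

    Case-insensitive exact match is a simplifying assumption; the paper's
    actual grading procedure may differ.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        raise ValueError("at least one question is required")
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return correct / len(references)


def relative_increase(baseline: float, with_rag: float) -> float:
    """Relative change in accuracy, expressed as a fraction of the baseline."""
    if baseline == 0:
        raise ValueError("baseline accuracy must be non-zero")
    return (with_rag - baseline) / baseline


# Toy, made-up answers purely to exercise the functions (not study data).
references = ["a", "b", "c", "d"]
answers_without_rag = ["a", "x", "c", "y"]
answers_with_rag = ["a", "b", "c", "y"]

acc_without = accuracy(answers_without_rag, references)
acc_with = accuracy(answers_with_rag, references)
gain = relative_increase(acc_without, acc_with)
```

In this toy run the same question set is scored twice, once per condition, so any difference in accuracy is attributable to the added context rather than to a change in questions.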
Taxonomy
Methods: Attention Is All You Need · Linear Warmup With Linear Decay · WordPiece · Cosine Annealing · Label Smoothing · BERT · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Linear Warmup With Cosine Annealing
