Contradictions in Context: Challenges for Retrieval-Augmented Generation in Healthcare
Saeedeh Javadi, Sara Mirabi, Manan Gangar, Bahadorreza Ofoghi

TL;DR
This paper examines how retrieval-augmented generation in healthcare can produce inconsistent or outdated answers when the retrieved documents contradict one another, and argues that retrieval needs contradiction-aware filtering rather than similarity ranking alone.
Contribution
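It introduces a benchmark dataset built from Australian Therapeutic Goods Administration (TGA) consumer medicine information documents, evaluates five LLMs on PubMed abstracts stratified by publication year, and analyzes how contradictions among retrieved abstracts affect model responses.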
Findings
Contradictions between similar abstracts degrade model performance.
Retrieval similarity alone is insufficient for reliable medical RAG.
Contradiction-aware filtering is necessary for trustworthy responses.
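To make the idea of contradiction-aware filtering concrete, here is a minimal sketch. It is not the authors' method: the `Abstract` record, the `toy_contradicts` detector, and the "prefer the newer abstract" policy are all assumptions made for illustration, and the example texts are made-up placeholders rather than real medical evidence. A real pipeline would likely replace the toy detector with a natural language inference model.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical record for a retrieved abstract; the field names are assumptions.
@dataclass
class Abstract:
    text: str
    year: int
    similarity: float  # retrieval score against the query, assumed precomputed

NEGATIONS = {"not", "no", "never"}

def toy_contradicts(a: str, b: str) -> bool:
    """Toy stand-in for a contradiction detector (assumption, not a real model).

    Flags a pair when both texts use the same words apart from negation
    words, and exactly one of them contains a negation word.
    """
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    negated_a = bool(words_a & NEGATIONS)
    negated_b = bool(words_b & NEGATIONS)
    return (words_a - NEGATIONS) == (words_b - NEGATIONS) and negated_a != negated_b

def filter_contradictions(
    candidates: List[Abstract],
    contradicts: Callable[[str, str], bool] = toy_contradicts,
) -> List[Abstract]:
    """Drop abstracts that contradict a newer abstract that was already kept.

    Assumed policy: visit abstracts newest first, keep one only if it does
    not contradict anything kept so far, then restore similarity order.
    """
    kept: List[Abstract] = []
    for cand in sorted(candidates, key=lambda d: d.year, reverse=True):
        if not any(contradicts(cand.text, k.text) for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda d: d.similarity, reverse=True)

# Made-up placeholder abstracts for illustration only.
docs = [
    Abstract("drug x is safe in pregnancy", 2005, 0.92),
    Abstract("drug x is not safe in pregnancy", 2021, 0.90),
    Abstract("drug x reduces blood pressure", 2015, 0.80),
]
context = filter_contradictions(docs)
```

In this sketch the 2005 abstract has the highest similarity score, yet it is removed because it conflicts with the 2021 abstract. Ranking by similarity alone would have passed both conflicting abstracts to the model.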
Abstract
In high-stakes information domains such as healthcare, where large language models (LLMs) can produce hallucinations or misinformation, retrieval-augmented generation (RAG) has been proposed as a mitigation strategy, grounding model outputs in external, domain-specific documents. Yet, this approach can introduce errors when source documents contain outdated or contradictory information. This work investigates the performance of five LLMs in generating RAG-based responses to medicine-related queries. Our contributions are three-fold: i) the creation of a benchmark dataset using consumer medicine information documents from the Australian Therapeutic Goods Administration (TGA), where headings are repurposed as natural language questions, ii) the retrieval of PubMed abstracts using TGA headings, stratified across multiple publication years, to enable controlled temporal evaluation of…
