DRAGged into Conflicts: Detecting and Addressing Conflicting Sources in Search-Augmented LLMs

Arie Cattan; Alon Jacovi; Ori Ram; Jonathan Herzig; Roee Aharoni; Sasha Goldshtein; Eran Ofek; Idan Szpektor; Avi Caciularu

arXiv:2506.08500·cs.CL·June 17, 2025

TL;DR

This paper introduces a taxonomy of knowledge conflict types and the CONFLICTS benchmark for detecting and resolving conflicting information among the sources retrieved for search-augmented LLMs, and shows that current models often struggle to resolve such conflicts.

Contribution

The paper proposes a novel taxonomy of knowledge conflict types in RAG together with the desired model behavior for each type, builds the expert-annotated CONFLICTS benchmark, and evaluates how well LLMs handle conflicting sources.

Findings

01. LLMs often struggle to appropriately resolve conflicts between retrieved sources.

02. Prompting models to reason explicitly about potential conflicts improves response quality.

03. Substantial room for improvement remains in conflict resolution.

Abstract

Retrieval Augmented Generation (RAG) is a commonly used approach for enhancing large language models (LLMs) with relevant and up-to-date information. However, the retrieved sources can often contain conflicting information and it remains unclear how models should address such discrepancies. In this work, we first propose a novel taxonomy of knowledge conflict types in RAG, along with the desired model behavior for each type. We then introduce CONFLICTS, a high-quality benchmark with expert annotations of conflict types in a realistic RAG setting. CONFLICTS is the first benchmark that enables tracking progress on how models address a wide range of knowledge conflicts. We conduct extensive experiments on this benchmark, showing that LLMs often struggle to appropriately resolve conflicts between sources. While prompting LLMs to explicitly reason about the potential conflict in the…
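The abstract mentions prompting LLMs to explicitly reason about a potential conflict among the retrieved sources. A minimal sketch of what such a prompt could look like follows. The function name `build_conflict_aware_prompt` and the wording of the two steps are assumptions made for illustration; they are not the prompt used in the paper, and the sketch does not encode the paper's taxonomy.

```python
from typing import List


def build_conflict_aware_prompt(query: str, sources: List[str]) -> str:
    """Assemble a prompt that asks the model to assess conflicts before answering.

    Hypothetical helper: the instruction wording is an illustrative assumption,
    not the prompt from the CONFLICTS paper.
    """
    if not sources:
        raise ValueError("at least one retrieved source is required")

    # Number each source so the model can refer to it explicitly.
    numbered = "\n".join(
        f"[{i}] {text.strip()}" for i, text in enumerate(sources, start=1)
    )

    return (
        "You are given a question and several retrieved sources.\n"
        "Step 1: State whether the sources disagree with each other, and if so, how.\n"
        "Step 2: Answer the question in a way that accounts for your assessment in Step 1.\n\n"
        f"Question: {query.strip()}\n\n"
        f"Sources:\n{numbered}\n"
    )


if __name__ == "__main__":
    # Toy, made-up sources used only to show the prompt layout.
    print(
        build_conflict_aware_prompt(
            "When was the bridge opened?",
            ["Source A gives one year.", "Source B gives a different year."],
        )
    )
```

The resulting string can be sent to any chat or completion model. Putting the conflict assessment before the answer is one simple way to make the reasoning explicit, though other orderings or a separate classification call would be equally plausible.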

Code & Models

Repositories

google-research-datasets/rag_conflicts (Official)

Taxonomy

Topics: Topic Modeling · Wikis in Education and Collaboration · Advanced Graph Neural Networks

Methods: Layer Normalization · Linear Warmup With Linear Decay · Attention Dropout · Byte Pair Encoding · Softmax · Linear Layer · Dropout · Dense Connections · Attention Is All You Need