ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM
Zhaochen Su, Jun Zhang, Xiaoye Qu, Tong Zhu, Yanshu Li, Jiashuo Sun, Juntao Li, Min Zhang, Yu Cheng

TL;DR
ConflictBank is a comprehensive benchmark designed to evaluate knowledge conflicts in large language models, addressing a critical source of hallucinations by analyzing conflicts in retrieved and encoded knowledge across multiple models.
Contribution
This paper introduces ConflictBank, the first large-scale benchmark for systematically assessing knowledge conflicts in LLMs from various aspects and sources.
Findings
Identified key conflict types and causes in LLMs
Analyzed conflict patterns across different model scales
Created over 7 million claim-evidence pairs for evaluation
Abstract
Large language models (LLMs) have achieved impressive advancements across numerous disciplines, yet the critical issue of knowledge conflicts, a major source of hallucinations, has rarely been studied. Only a few studies have explored the conflicts between the inherent knowledge of LLMs and the retrieved contextual knowledge, and a thorough assessment of knowledge conflicts in LLMs is still missing. Motivated by this research gap, we present ConflictBank, the first comprehensive benchmark developed to systematically evaluate knowledge conflicts from three aspects: (i) conflicts encountered in retrieved knowledge, (ii) conflicts within the models' encoded knowledge, and (iii) the interplay between these conflict forms. Our investigation delves into four model families and twelve LLM instances, meticulously analyzing conflicts stemming from misinformation, temporal discrepancies, and…
Taxonomy
Topics: Artificial Intelligence in Law
