The Gray Zone of Faithfulness: Taming Ambiguity in Unfaithfulness Detection
Qiang Ding, Lvzhou Luo, Yixuan Cao, Ping Luo

TL;DR
This paper introduces an annotation framework and a benchmark, VeriGray, for detecting unfaithfulness in LLM-generated summaries. It reduces annotation ambiguity by adding an intermediate Out-Dependent category, and it shows that even SOTA LLMs hallucinate in about 6% of summary sentences.
Contribution
We propose an annotation framework with an Out-Dependent category and create VeriGray, a challenging benchmark for unfaithfulness detection in summarization tasks.
Findings
Even SOTA LLMs hallucinate in about 6% of the sentences in their summaries.
Approximately 9% of generated sentences require external knowledge for verification.
Our benchmark challenges existing detection methods, highlighting room for improvement.
Abstract
Ensuring that Large Language Models (LLMs) generate summaries faithful to a given source document is essential for real-world applications. While prior research has explored LLM faithfulness, existing benchmarks suffer from annotation ambiguity, primarily due to the ill-defined boundary of permissible external knowledge in generated outputs. For instance, common sense is often incorporated into responses and labeled as "faithful", yet the acceptable extent of such knowledge remains unspecified, leading to inconsistent annotations. To address this issue, we propose a novel faithfulness annotation framework, which introduces an intermediate category, Out-Dependent, to classify cases where external knowledge is required for verification. Using this framework, we construct VeriGray (Verification with the Gray Zone) -- a new unfaithfulness detection benchmark in summarization. Statistics…
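To make the three-way labeling concrete, here is a minimal Python sketch of how sentence-level labels might be tallied into the kind of rates quoted under Findings. Only the Out-Dependent category is named in the abstract; the `Label` enum, the other two label names, the `label_rates` helper, and the toy data are all assumptions for illustration and are not taken from the VeriGray release.

```python
from collections import Counter
from enum import Enum


class Label(Enum):
    # Only "Out-Dependent" is named in the abstract; the other two
    # label names are assumed here for illustration.
    FAITHFUL = "faithful"
    OUT_DEPENDENT = "out-dependent"  # needs external knowledge to verify
    UNFAITHFUL = "unfaithful"


def label_rates(labels):
    """Return the fraction of sentences carrying each label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    total = len(labels)
    # Iterate over the enum so labels with zero count still appear.
    return {label: counts[label] / total for label in Label}


# Toy annotations, invented for illustration only (not VeriGray data).
toy = [Label.FAITHFUL] * 17 + [Label.OUT_DEPENDENT] * 2 + [Label.UNFAITHFUL]
rates = label_rates(toy)

for label, rate in rates.items():
    print(f"{label.value}: {rate:.0%}")
```

Keeping Out-Dependent as its own label, rather than folding it into either of the other two, is what lets the share of sentences needing external knowledge be reported separately from the hallucination rate.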
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
