The Impact of Negated Text on Hallucination with Large Language Models
Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim

TL;DR
This paper investigates how negated text affects hallucination detection in large language models. It finds that LLMs struggle with negation and often produce logically inconsistent or unfaithful judgments on negated inputs.
Contribution
The study introduces the NegHalu dataset for negated hallucination detection and analyzes LLMs' difficulties in recognizing hallucinations in negated contexts.
Findings
LLMs often fail to detect hallucinations in negated text
Negated inputs cause LLMs to produce inconsistent judgments
Token-level analysis of the models' internal states shows that they have difficulty processing negation
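To make the notion of an "inconsistent judgment" concrete, here is a minimal Python sketch of one way such inconsistency could be measured. It is not the authors' evaluation protocol. It assumes each example comes as an affirmative statement paired with its direct negation, so that a logically consistent judge should give the two versions opposite hallucination labels. The names `PairJudgment`, `is_consistent`, and `inconsistency_rate` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PairJudgment:
    """A model's verdicts on one statement and on its negated form (hypothetical structure)."""
    affirmative_is_hallucination: bool
    negated_is_hallucination: bool


def is_consistent(judgment: PairJudgment) -> bool:
    # Assumption: negating a statement flips its truth value, so a consistent
    # judge must label exactly one of the two versions as a hallucination.
    return judgment.affirmative_is_hallucination != judgment.negated_is_hallucination


def inconsistency_rate(judgments: Sequence[PairJudgment]) -> float:
    """Fraction of pairs where the model gave the same label to both versions."""
    if not judgments:
        return 0.0
    inconsistent = sum(1 for j in judgments if not is_consistent(j))
    return inconsistent / len(judgments)


if __name__ == "__main__":
    # Illustrative, made-up verdicts; not results from the paper.
    sample = [
        PairJudgment(True, False),   # labels flip: consistent
        PairJudgment(True, True),    # both flagged: inconsistent
        PairJudgment(False, False),  # neither flagged: inconsistent
        PairJudgment(False, True),   # labels flip: consistent
    ]
    print(f"inconsistency rate: {inconsistency_rate(sample):.2f}")
```

A pairwise check of this kind measures only logical consistency, not correctness: a model that flips its label on every pair would score perfectly even if both labels were wrong, so it would need to be reported alongside ordinary detection accuracy.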
Abstract
Recent studies on hallucination in large language models (LLMs) have been actively progressing in natural language processing. However, the impact of negated text on hallucination with LLMs remains largely unexplored. In this paper, we set three important yet unanswered research questions and aim to address them. To derive the answers, we investigate whether LLMs can recognize contextual shifts caused by negation and still reliably distinguish hallucinations comparable to affirmative cases. We also design the NegHalu dataset by reconstructing existing hallucination detection datasets with negated expressions. Our experiments demonstrate that LLMs struggle to detect hallucinations in negated text effectively, often producing logically inconsistent or unfaithful judgments. Moreover, we trace the internal state of LLMs as they process negated inputs at the token level and reveal the…
Taxonomy
Topics: Mental Health via Writing · Psychedelics and Drug Studies · Misinformation and Its Impacts
