A Unified Definition of Hallucination: It's The World Model, Stupid!
Emmy Liu, Varun Gangal, Chelsea Zou, Michael Yu, Xiaoqi Huang, Alex Chang, Zhuofu Tao, Karan Singh, Sachin Kumar, Steven Y. Feng

TL;DR
This paper proposes a unified definition of hallucination in language models as inaccurate world modeling, aiming to clarify evaluation standards and facilitate the development of benchmarks to improve model reliability.
Contribution
It introduces a comprehensive framework unifying prior hallucination definitions based on world model inaccuracies, enabling clearer evaluation and comparison across benchmarks.
Findings
Unified hallucination definition based on world model inaccuracies
Framework distinguishes true hallucinations from other errors
Plans for benchmarks using synthetic reference models
Abstract
Despite numerous attempts at mitigation since the inception of language models, hallucinations remain a persistent problem even in today's frontier LLMs. Why is this? We review existing definitions of hallucination and fold them into a single, unified definition wherein prior definitions are subsumed. We argue that hallucination can be unified by defining it as simply inaccurate (internal) world modeling, in a form where it is observable to the user. Examples include stating a fact that contradicts a knowledge base or producing a summary that contradicts the source. By varying the reference world model and conflict policy, our framework unifies prior definitions. We argue that this unified view is useful because it forces evaluations to clarify their assumed reference "world", distinguishes true hallucinations from planning or reward errors, and provides a common language for comparison…
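To make the idea of "varying the reference world model and conflict policy" concrete, here is a minimal Python sketch. It is an illustration only, not the paper's formalism: the names `World`, `prefer_context`, `prefer_knowledge_base`, and `is_hallucination` are hypothetical, and representing a world as a flat key-value mapping is a simplifying assumption.

```python
from typing import Callable, Dict, Optional

# Assumption: a "world" is a flat mapping from a query key to its accepted value.
World = Dict[str, str]

# Assumption: a conflict policy picks the reference value when sources may disagree.
ConflictPolicy = Callable[[Optional[str], Optional[str]], Optional[str]]


def prefer_context(context_value: Optional[str], kb_value: Optional[str]) -> Optional[str]:
    """The source document wins; fall back to the knowledge base."""
    return context_value if context_value is not None else kb_value


def prefer_knowledge_base(context_value: Optional[str], kb_value: Optional[str]) -> Optional[str]:
    """The knowledge base wins; fall back to the source document."""
    return kb_value if kb_value is not None else context_value


def is_hallucination(
    key: str,
    claim: str,
    context: World,
    knowledge_base: World,
    policy: ConflictPolicy,
) -> bool:
    """Flag a claim that contradicts the reference chosen by the policy."""
    reference = policy(context.get(key), knowledge_base.get(key))
    if reference is None:
        # Assumption: with no reference at all, the claim is treated as
        # unverifiable rather than as a hallucination.
        return False
    return claim != reference


# Hypothetical example: the source document disagrees with the knowledge base.
context = {"capital_of_australia": "Sydney"}
knowledge_base = {"capital_of_australia": "Canberra"}

faithful_view = is_hallucination(
    "capital_of_australia", "Sydney", context, knowledge_base, prefer_context
)
factual_view = is_hallucination(
    "capital_of_australia", "Sydney", context, knowledge_base, prefer_knowledge_base
)
```

In this toy setup the same output, "Sydney", is acceptable when the source document is the reference world (a faithfulness-style evaluation) and is flagged when the knowledge base is the reference world (a factuality-style evaluation). The verdict changes only because the assumed reference and conflict policy change, which is the kind of choice the abstract says evaluations should state explicitly.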
Taxonomy
Topics: Schizophrenia research and treatment · Mental Health via Writing · Mental Health and Psychiatry
