Hallucination as Commitment Failure: Larger LLMs Misfire Despite Knowing the Answer
Jewon Yeom, Jaewon Sok, Heejun Kim, Seonghyeon Park, Jeongjae Park, Taesup Kim

TL;DR
This paper investigates why larger language models hallucinate despite knowing the correct answers, revealing that increased scale and instruction tuning lead models to commit more confidently to answers, causing hallucinations even when the correct concept is available.
Contribution
It introduces a semantic notion of answer availability and shows that larger models tend to sharpen answer commitment, which explains hallucinations beyond mere knowledge absence.
Findings
16-47% of Instruct-model hallucinations occur with substantial probability mass already on the correct concept.
The share of hallucinations that occur with the correct concept available rises monotonically with model scale.
Correct generations concentrate probability on a single form; hallucinations disperse it.
Abstract
Hallucination is often viewed as a direct consequence of missing knowledge: a model answers incorrectly when the correct answer is absent from its generation-time distribution, and correctly when it is present. We test this assumption by introducing a semantic notion of answer availability that aggregates token-level variants expressing the same answer concept, and asks whether the correct concept is already available at the moment the model commits to an answer. Across Qwen and Llama models from 0.8B to 72B in both Instruct and Base variants, 16-47% of Instruct hallucinations occur with substantial probability mass already on the correct concept, and the rate rises monotonically with scale. Comparing such failures against correct generations with matched semantic support, the distinguishing factor is not whether the correct concept is represented, but how its probability is…
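To make the idea of semantic answer availability concrete, here is a minimal sketch, not the authors' implementation. It assumes a next-token distribution given as a plain dictionary, treats tokens as variants of the same concept when they match after whitespace stripping and lowercasing, and measures how concentrated the concept's mass is on its single most probable form. The function names, the normalization rule, and the toy probabilities are all hypothetical and chosen only for illustration.

```python
def _matching_masses(token_probs, variants):
    """Probabilities of tokens that normalize to any variant of the concept.

    Assumption: two surface forms express the same concept if they are equal
    after stripping whitespace and lowercasing. The paper's own aggregation
    rule may differ.
    """
    targets = {v.strip().lower() for v in variants}
    return [p for tok, p in token_probs.items() if tok.strip().lower() in targets]


def concept_availability(token_probs, variants):
    """Total probability mass on the concept, summed over its token variants."""
    return sum(_matching_masses(token_probs, variants))


def form_concentration(token_probs, variants):
    """Share of the concept's mass held by its single most probable form."""
    masses = _matching_masses(token_probs, variants)
    total = sum(masses)
    return max(masses) / total if total > 0 else 0.0


# Toy distribution at the step where the model commits to an answer
# (hypothetical numbers, not taken from the paper).
token_probs = {
    " Paris": 0.20,
    "Paris": 0.10,
    " paris": 0.05,
    " Lyon": 0.40,
    " the": 0.25,
}
variants = ["Paris"]

availability = concept_availability(token_probs, variants)
concentration = form_concentration(token_probs, variants)
committed = max(token_probs, key=token_probs.get)

print(f"availability={availability:.2f}, concentration={concentration:.2f}, "
      f"committed={committed!r}")
```

In this toy case the correct concept holds substantial mass in aggregate, yet that mass is spread over three surface forms, so a single wrong token still wins the commitment step. That is the pattern the summary describes: the concept is available, but its probability is dispersed rather than concentrated on one form.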
