HACK: Hallucinations Along Certainty and Knowledge Axes

Adi Simhi; Jonathan Herzig; Itay Itzhak; Dana Arad; Zorik Gekhman; Roi Reichart; Fazl Barez; Gabriel Stanovsky; Idan Szpektor; Yonatan Belinkov

arXiv:2510.24222 · cs.CL · October 29, 2025

TL;DR

This paper introduces a framework for categorizing hallucinations in large language models along knowledge and certainty axes, enabling targeted mitigation strategies and revealing distinct hallucination patterns across models.

Contribution

The paper proposes a novel two-axis framework for understanding hallucinations in LLMs, including a model-specific dataset construction method and evaluation metrics for mitigation effectiveness.

Findings

01. Differentiates hallucinations caused by a lack of knowledge from those that occur even though the model has the relevant knowledge.

02. Shows that mitigation methods vary in effectiveness across hallucination types.

03. Identifies a subset of hallucinations in which models answer incorrectly with certainty despite having the correct knowledge.

Abstract

Hallucinations in LLMs present a critical barrier to their reliable usage. Existing research usually categorizes hallucinations by their external properties rather than by the LLMs' underlying internal properties. This external focus overlooks that hallucinations may require tailored mitigation strategies based on their underlying mechanism. We propose a framework for categorizing hallucinations along two axes: knowledge and certainty. Since parametric knowledge and certainty may vary across models, our categorization method involves a model-specific dataset construction process that differentiates between these types of hallucinations. Along the knowledge axis, we distinguish between hallucinations caused by a lack of knowledge and those occurring despite the model having the knowledge of the correct response. To validate our framework along the knowledge axis, we apply steering…
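To make the two-axis idea concrete, the sketch below places a single hallucinated answer into one of the four cells formed by crossing the knowledge axis with the certainty axis. It is only an illustration under stated assumptions: the abstract excerpt does not say how knowledge or certainty is measured, so the `has_knowledge` flag, the `certainty` score in [0, 1], the 0.5 threshold, and all class and function names here are hypothetical and are not the authors' method.

```python
from dataclasses import dataclass
from enum import Enum


class Quadrant(Enum):
    """The four cells formed by crossing the knowledge and certainty axes."""
    NO_KNOWLEDGE_UNCERTAIN = "lacks knowledge, uncertain"
    NO_KNOWLEDGE_CERTAIN = "lacks knowledge, certain"
    KNOWLEDGE_UNCERTAIN = "has knowledge, uncertain"
    KNOWLEDGE_CERTAIN = "has knowledge, certain"


@dataclass(frozen=True)
class Hallucination:
    prompt: str
    has_knowledge: bool  # assumption: produced by a separate, model-specific check
    certainty: float     # assumption: score in [0, 1], higher means more certain


def categorize(h: Hallucination, threshold: float = 0.5) -> Quadrant:
    """Place one hallucinated answer on the two axes (threshold is hypothetical)."""
    certain = h.certainty >= threshold
    if h.has_knowledge:
        return Quadrant.KNOWLEDGE_CERTAIN if certain else Quadrant.KNOWLEDGE_UNCERTAIN
    return Quadrant.NO_KNOWLEDGE_CERTAIN if certain else Quadrant.NO_KNOWLEDGE_UNCERTAIN


if __name__ == "__main__":
    example = Hallucination("Who wrote the novel?", has_knowledge=True, certainty=0.93)
    print(categorize(example).value)
```

Under these assumptions, the `KNOWLEDGE_CERTAIN` cell corresponds to the subset highlighted in the findings: the model has the correct knowledge yet hallucinates with certainty. Because knowledge and certainty vary across models, the same prompt could land in different cells for different models.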
