A Geometric Taxonomy of Hallucinations in LLMs

Javier Marín

arXiv:2602.13224 · cs.AI · May 11, 2026

TL;DR

This paper introduces a geometric taxonomy for detecting hallucinations in language models using embedding space analysis, providing interpretable methods suited for black-box deployment scenarios.

Contribution

The paper develops a geometry-based framework that predicts which hallucination types are detectable, and validates it on a new human-confabulated dataset spanning multiple domains.

Findings

01. Query-proximate unfaithfulness detectable by angular ratio

02. Directional signatures outperform NLI in confabulation detection

03. Some factual errors are indistinguishable by angular geometry

Abstract

Hallucinations in deployed language models can have real consequences for downstream decisions in domains such as healthcare, legal, and financial services. In production, detection has to run on what the deployed system can see: the query, the response, and often a source document. White-box access to model internals and multi-sample querying are not generally available behind a third-party API. Within this setting (black-box, single-pass, only question/answer available), the dominant baseline is NLI, which returns a value but no diagnosis when it fails. We argue that operating directly on the geometry of the embedding space provides detection methods whose successes and failures are interpretable as structural properties of contrastive sentence-encoder training (Wang and Isola, 2020). The contribution is: given an operationally-motivated taxonomy, geometry predicts which types…
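To make "operating directly on the geometry of the embedding space" concrete, here is a minimal sketch of the kind of quantity such a detector could compute from a query, a response, and a source document. This page does not define the paper's angular ratio, so the function `hypothetical_angular_ratio` and the toy 2-D vectors below are assumptions for illustration only, not the author's method; in practice the vectors would come from a sentence encoder.

```python
import math


def angle(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos)


def hypothetical_angular_ratio(query, response, source):
    """Illustrative ratio (an assumption, not the paper's definition):
    angle(response, source) / angle(response, query).
    Undefined when the response is collinear with the query."""
    return angle(response, source) / angle(response, query)


# Toy 2-D "embeddings", made up for illustration (not model outputs).
query = [1.0, 0.0]
source = [0.0, 1.0]
grounded = [0.2, 1.0]  # points mostly toward the source
drifting = [1.0, 0.2]  # points mostly toward the query

r_grounded = hypothetical_angular_ratio(query, grounded, source)
r_drifting = hypothetical_angular_ratio(query, drifting, source)
print(r_grounded < 1.0 < r_drifting)
```

Under this assumed definition, a response that sits closer in angle to the source than to the query yields a ratio below 1, and a response that stays near the query yields a ratio above 1. Any threshold, and whether this particular ratio matches the paper's, would have to be taken from the full text.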
