Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models

Lexiang Xiong; Qi Li; Jingwen Ye; Xinchao Wang

arXiv:2603.15557·cs.CV·March 17, 2026

TL;DR

This paper introduces a diagnostic framework for hallucinations in vision-language models (VLMs). It uses information-theoretic probes to project a model's generation onto an interpretable, low-dimensional Cognitive State Space, and it relies on a geometric-information duality to detect hallucinations as geometric anomalies in that space and to interpret them.

Contribution

It recasts hallucinations in VLMs from static output errors into dynamic pathologies of the model's computation, and diagnoses them by tracing the generation trajectory through a low-dimensional Cognitive State Space and applying geometric anomaly detection.

Findings

1. Achieves state-of-the-art hallucination detection across diverse tasks
2. Operates efficiently with weak supervision and contaminated data
3. Provides causal attribution to different pathological states

Abstract

Vision-Language Models (VLMs) frequently "hallucinate" - generate plausible yet factually incorrect statements - posing a critical barrier to their trustworthy deployment. In this work, we propose a new paradigm for diagnosing hallucinations, recasting them from static output errors into dynamic pathologies of a model's computational cognition. Our framework is grounded in a normative principle of computational rationality, allowing us to model a VLM's generation as a dynamic cognitive trajectory. We design a suite of information-theoretic probes that project this trajectory onto an interpretable, low-dimensional Cognitive State Space. Our central discovery is a governing principle we term the geometric-information duality: a cognitive trajectory's geometric abnormality within this space is fundamentally equivalent to its high information-theoretic surprisal. Hallucination detection is…
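
The duality described in the abstract has a familiar textbook analogue, sketched below in Python. This is an illustration only and not the authors' method: it assumes, hypothetically, that each generation is summarized by a two-dimensional probe vector and that normal generations follow a diagonal Gaussian. Under that assumption, the surprisal (negative log-density) of a point equals half its squared Mahalanobis distance plus a constant, so ranking points by geometric abnormality and ranking them by surprisal give the same order. The probe values, the function names, and the Gaussian model are all assumptions introduced for this example.

```python
import math
import random


def fit_diagonal_gaussian(points):
    """Estimate a per-dimension mean and variance from reference points."""
    n = len(points)
    dims = len(points[0])
    means = [sum(p[i] for p in points) / n for i in range(dims)]
    variances = [
        sum((p[i] - means[i]) ** 2 for p in points) / n for i in range(dims)
    ]
    return means, variances


def mahalanobis_sq(x, means, variances):
    """Geometric abnormality: squared distance scaled by each variance."""
    return sum((xi - m) ** 2 / v for xi, m, v in zip(x, means, variances))


def surprisal(x, means, variances):
    """Information-theoretic abnormality: negative log-density in nats."""
    total = 0.0
    for xi, m, v in zip(x, means, variances):
        log_density = -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        total -= log_density
    return total


# Hypothetical probe vectors for generations assumed to be faithful.
rng = random.Random(0)
reference = [(rng.gauss(0.6, 0.1), rng.gauss(0.3, 0.05)) for _ in range(500)]
means, variances = fit_diagonal_gaussian(reference)

typical = (0.6, 0.3)   # close to the reference cloud
outlier = (0.1, 0.9)   # far from the reference cloud

for name, point in (("typical", typical), ("outlier", outlier)):
    d2 = mahalanobis_sq(point, means, variances)
    s = surprisal(point, means, variances)
    # s - 0.5 * d2 is the same constant for every point.
    print(f"{name}: distance^2={d2:.2f} surprisal={s:.2f} offset={s - 0.5 * d2:.4f}")
```

The printed offset is identical for both points, which is the sense in which distance and surprisal carry the same information in this toy setting. Whether the paper's Cognitive State Space uses a Gaussian model or a different anomaly score is not stated in the truncated abstract.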

Taxonomy

Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Ferroelectric and Negative Capacitance Devices