FaithSCAN: Model-Driven Single-Pass Hallucination Detection for Faithful Visual Question Answering

Chaodong Tong; Qi Zhang; Chen Li; Lei Jiang; Yanbing Liu

arXiv:2601.00269·cs.CV·February 3, 2026

Open Access

TL;DR

FaithSCAN is a lightweight, model-driven approach that detects hallucinations in visual question answering by leveraging internal signals of vision-language models, improving accuracy and efficiency without external resources.

Contribution

The paper introduces FaithSCAN, a novel internal-signal-based hallucination detection method for VQA that outperforms existing external and uncertainty-driven approaches.

Findings

01

FaithSCAN significantly improves detection accuracy over existing methods.

02

Internal signals reveal systematic causes of hallucinations in VQA models.

03

Hallucination patterns differ across VLM architectures, providing new insights.

Abstract

Faithfulness hallucinations in VQA occur when vision-language models produce fluent yet visually ungrounded answers, severely undermining their reliability in safety-critical applications. Existing detection methods mainly fall into two categories: external verification approaches relying on auxiliary models or knowledge bases, and uncertainty-driven approaches using repeated sampling or uncertainty estimates. The former suffer from high computational overhead and are limited by external resource quality, while the latter capture only limited facets of model uncertainty and fail to sufficiently explore the rich internal signals associated with the diverse failure modes. Both paradigms thus have inherent limitations in efficiency, robustness, and detection performance. To address these challenges, we propose FaithSCAN: a lightweight network that detects hallucinations by exploiting rich…

Taxonomy

Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning