PAS: Prelim Attention Score for Detecting Object Hallucinations in Large Vision-Language Models

Nhat Hoang-Xuan; Minh Vu; My T. Thai; Manish Bhattarai

arXiv:2511.11502·cs.CV·November 17, 2025

Open Access

TL;DR

This paper introduces PAS, a lightweight, training-free score computed from attention weights over previously generated (prelim) tokens. PAS detects object hallucinations in large vision-language models and can be used to filter them in real time during inference.

Contribution

We propose PAS, a novel attention-based metric that effectively identifies hallucinations without additional training, enhancing model reliability during inference.

Findings

01. PAS outperforms existing hallucination detection methods.

02. PAS enables real-time filtering of hallucinations.

03. Weak image dependence correlates with hallucination likelihood.

Abstract

Large vision-language models (LVLMs) are powerful, yet they remain unreliable due to object hallucinations. In this work, we show that in many hallucinatory predictions the LVLM effectively ignores the image and instead relies on previously generated output (prelim) tokens to infer new objects. We quantify this behavior via the mutual information between the image and the predicted object conditioned on the prelim, demonstrating that weak image dependence strongly correlates with hallucination. Building on this finding, we introduce the Prelim Attention Score (PAS), a lightweight, training-free signal computed from attention weights over prelim tokens. PAS requires no additional forward passes and can be computed on the fly during inference. Exploiting this previously overlooked signal, PAS achieves state-of-the-art object-hallucination detection across multiple models and datasets,…
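To make the idea of a score "computed from attention weights over prelim tokens" concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the paper's definition: the abstract does not give the exact formula, so the aggregation used here (sum the attention mass on prelim positions, then average over heads of a single layer) is an assumption, as are the function names `prelim_attention_score` and `flag_object` and the threshold value.

```python
import numpy as np


def prelim_attention_score(attn, prelim_idx):
    """Toy prelim attention score for one predicted object token.

    attn:       array of shape (num_heads, context_len) holding the attention
                weights of the current query position over all context
                positions (image, prompt, and previously generated tokens).
    prelim_idx: positions of the previously generated (prelim) tokens.

    ASSUMPTION: sum over prelim positions and mean over heads; the paper's
    actual aggregation may differ.
    """
    attn = np.asarray(attn, dtype=float)
    per_head_mass = attn[:, list(prelim_idx)].sum(axis=1)
    return float(per_head_mass.mean())


def flag_object(score, threshold=0.5):
    """Flag a prediction when most attention goes to prelim tokens.

    ASSUMPTION: the threshold is hypothetical and would need tuning on
    labeled data.
    """
    return score > threshold


# Toy example: 2 heads, 6 context positions (0-3 image, 4-5 prelim).
attn = [
    [0.1, 0.1, 0.1, 0.1, 0.3, 0.3],
    [0.2, 0.2, 0.2, 0.2, 0.1, 0.1],
]
score = prelim_attention_score(attn, prelim_idx=[4, 5])
```

In this toy case the two heads place 0.6 and 0.2 of their attention on the prelim positions, so the score is their mean, 0.4. Because the weights are already produced by the decoding forward pass, a score of this kind needs no extra forward pass, which matches the abstract's description of PAS as computable on the fly.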

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Face Recognition and Perception