Mitigating Object Hallucinations in Large Vision-Language Models via Attention Calibration
Younan Zhu, Linwei Tao, Minjing Dong, Chang Xu

TL;DR
This paper introduces Dynamic Attention Calibration (DAC), a method that reduces object hallucinations in large vision-language models by calibrating vision-token attention so that it no longer depends on token position, which improves factual alignment with the image.
Contribution
The paper proposes a dynamic, contrastive learning-based attention calibration module that generalizes across models and inputs, effectively mitigating hallucinations in LVLMs.
Findings
DAC significantly reduces object hallucinations.
The method improves multimodal alignment across benchmarks.
DAC outperforms static attention reordering methods.
Abstract
Large Vision-Language Models (LVLMs) exhibit impressive multimodal reasoning capabilities but remain highly susceptible to object hallucination, where models generate responses that are not factually aligned with the visual content. Recent works attribute this issue to an inherent bias of LVLMs where the vision token attention map has spurious focus on certain positions, and propose to mitigate this issue by reordering visual tokens. However, we find that different LVLMs exhibit different correlations between attention and spatial position, which makes existing static solutions difficult to generalize to other LVLMs. To begin with, we investigate the attention bias introduced by image tokens through a toy experiment, in which a blank image is fed into the model to capture its position-dependent bias. We then remove this bias from the original attention map, which already leads to a…
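The blank-image experiment described in the abstract can be illustrated with a small sketch. The abstract says only that the position-dependent bias is "removed" from the attention map. The choice below, dividing by the blank-image attention and renormalizing, is an assumption made for illustration and not necessarily the authors' procedure. The function name `calibrate_attention`, the `eps` parameter, and the toy values are all hypothetical.

```python
def calibrate_attention(attn, blank_attn, eps=1e-8):
    """Divide out a position-dependent bias and renormalize.

    attn:       attention weights over image-token positions for a real image.
    blank_attn: attention weights over the same positions for a blank image,
                used here as an estimate of the position-dependent bias.
    Both are flat lists of non-negative floats of equal length.

    Assumption: "removing" the bias is implemented as element-wise division
    followed by renormalization. The source does not specify the operation.
    """
    if len(attn) != len(blank_attn):
        raise ValueError("attention maps must cover the same token positions")
    # Down-weight positions the model attends to even when the image is blank.
    ratios = [a / (b + eps) for a, b in zip(attn, blank_attn)]
    total = sum(ratios)
    if total <= 0:
        raise ValueError("attention weights must contain a positive entry")
    # Renormalize so the calibrated weights sum to 1 again.
    return [r / total for r in ratios]


# Toy example with 4 image-token positions. The last position receives most
# of the attention even for a blank image, i.e. it carries a spurious bias.
blank_attn = [0.1, 0.1, 0.1, 0.7]
attn = [0.1, 0.3, 0.1, 0.5]
calibrated = calibrate_attention(attn, blank_attn)
```

In this toy case the raw map puts the most weight on the last position, while the calibrated map puts the most weight on the second position, whose attention is high relative to its blank-image baseline.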
Taxonomy
Topics: Brain Tumor Detection and Classification · Functional Brain Connectivity Studies · COVID-19 diagnosis using AI
Methods: Softmax · Attention Is All You Need · Dynamic Algorithm Configuration
