Dual-Pathway Circuits of Object Hallucination in Vision-Language Models
Jiaxin Liu, Ding Zhong, Yue Wang, Zhidong Yang, Zhaolu Kang, Guangyuan Dong, Qishi Zhan, Pengcheng Fang, Aofan Liu

TL;DR
This paper introduces a framework for understanding and reducing object hallucinations in vision-language models by identifying and intervening on the specific neural circuits that produce them.
Contribution
It presents Dual-Pathway Circuit Analysis, a method for identifying and causally intervening on hallucination-related circuits in VLMs, with the aim of improving their reliability.
Findings
Activation patching identifies a visual grounding pathway and a hallucination pathway across five architecturally diverse VLMs.
Scaling hallucination pathways reduces hallucinations by up to 76%.
Hallucination circuits transfer selectively across different hallucination types.
Abstract
Vision-language models (VLMs) have demonstrated remarkable capabilities in bridging visual perception and natural language understanding, enabling a wide range of multimodal reasoning tasks. However, they often produce object hallucinations, describing content absent from the input image, which limits their reliability and interpretability. To address this limitation, we propose Dual-Pathway Circuit Analysis, a framework that identifies and characterizes hallucination-related circuits in VLMs for mechanistic understanding and causal probing. We first apply activation patching across five architecturally diverse VLMs to identify a visual grounding pathway that supports correct predictions and a hallucination pathway that drives erroneous outputs. We then introduce Conditional Pathway Analysis (CPA) to characterize pathway-level interactions, revealing that grounding components remain…
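
The activation patching step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy two-layer network, its weights, and the names `forward` and `patching_effects` are all hypothetical, chosen only to show the general technique. Each hidden unit's activation from a "clean" run is copied into a "corrupted" run, and the fraction of the output gap it restores is recorded.

```python
def relu(v):
    return max(0.0, v)


def forward(x, W1, w2, patch=None):
    """Run a toy network: hidden = relu(W1 @ x), output = w2 . hidden.

    `patch` (hypothetical helper argument) maps a hidden-unit index to an
    activation value that overwrites that unit before the readout.
    """
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    if patch:
        for idx, value in patch.items():
            hidden[idx] = value
    output = sum(w * h for w, h in zip(w2, hidden))
    return output, hidden


def patching_effects(clean_x, corrupt_x, W1, w2):
    """Return, per hidden unit, the share of the clean-vs-corrupt output gap
    recovered by patching that unit's clean activation into the corrupt run."""
    clean_out, clean_hidden = forward(clean_x, W1, w2)
    corrupt_out, _ = forward(corrupt_x, W1, w2)
    gap = clean_out - corrupt_out
    if gap == 0:
        raise ValueError("clean and corrupt runs give the same output")
    effects = []
    for i in range(len(W1)):
        patched_out, _ = forward(corrupt_x, W1, w2, patch={i: clean_hidden[i]})
        effects.append((patched_out - corrupt_out) / gap)
    return effects


# Illustrative weights and inputs (assumed values, not taken from the paper).
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w2 = [2.0, -1.0, 0.5]
clean_x = [1.0, 0.0]
corrupt_x = [0.0, 1.0]

effects = patching_effects(clean_x, corrupt_x, W1, w2)
for i, e in enumerate(effects):
    print(f"hidden unit {i}: recovered fraction = {e:.3f}")
```

In a real VLM the patched components would be attention heads or MLP blocks rather than single hidden units, and the metric would typically be a logit difference between the correct and the hallucinated object token; those choices are assumptions here, since the truncated abstract does not specify them.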
