Intervene-All-Paths: Unified Mitigation of LVLM Hallucinations across Alignment Formats
Jiaye Qian, Ge Zheng, Yuchen Zhu, Sibei Yang

TL;DR
This paper introduces a unified intervention framework for Large Vision-Language Models (LVLMs) that targets multiple causal pathways (image-to-input-text, image-to-output-text, and text-to-text) to reduce hallucinations across question-answer alignment formats.
Contribution
It shows that hallucinations stem from the interplay of multiple causal pathways rather than from a single one, and introduces methods to identify and intervene on critical hallucination heads within each pathway.
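The general idea of ranking attention heads along one pathway and then intervening on the top-ranked ones can be illustrated with a toy sketch. This is not the paper's procedure: the scoring rule (mean attention mass flowing from one token group to another), the function names `pathway_mass`, `top_k_heads`, and `scale_heads`, and the choice of scaling as the intervention are all assumptions made for illustration.

```python
import numpy as np


def pathway_mass(attn, query_idx, key_idx):
    """Per-head mean attention mass from query positions to key positions.

    attn: array of shape (num_heads, seq_len, seq_len), rows sum to 1.
    Hypothetical scoring rule, not taken from the paper.
    """
    sub = attn[:, query_idx][:, :, key_idx]   # (heads, |Q|, |K|)
    return sub.sum(axis=-1).mean(axis=-1)     # (heads,)


def top_k_heads(scores, k):
    """Indices of the k highest-scoring heads."""
    return np.argsort(scores)[::-1][:k]


def scale_heads(head_out, heads, scale):
    """Return a copy of head_out with the selected heads' outputs scaled.

    head_out: array of shape (num_heads, seq_len, head_dim).
    """
    out = head_out.astype(float)  # astype returns a new array
    out[heads] *= scale
    return out


# Toy setup: 3 heads, 4 tokens; tokens 0-1 stand for image, 2-3 for text.
image_idx, text_idx = [0, 1], [2, 3]
attn = np.zeros((3, 4, 4))
attn[0] = 0.25            # head 0 attends uniformly
attn[1, :, :2] = 0.5      # head 1 attends only to image tokens
attn[2, :, 2:] = 0.5      # head 2 attends only to text tokens

# Score heads along an image-to-text pathway (text queries, image keys).
mass = pathway_mass(attn, text_idx, image_idx)
critical = top_k_heads(mass, k=1)

# Intervene by zeroing the selected head's output (scale is an assumption).
head_out = np.ones((3, 4, 2))
damped = scale_heads(head_out, critical, scale=0.0)
```

In a real model the attention maps and head outputs would come from forward hooks on the transformer layers, and the direction of the intervention (suppressing or amplifying a head) would depend on how that head relates to hallucination in the given format.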
Findings
Intervening on critical heads within specific pathways reduces hallucinations.
The pathways an LVLM relies on vary with the question-answer alignment format (discriminative vs. generative).
The proposed methods improve performance across multiple benchmarks.
Abstract
Despite their impressive performance across a wide range of tasks, Large Vision-Language Models (LVLMs) remain prone to hallucination. In this study, we propose a comprehensive intervention framework aligned with the transformer's causal architecture in LVLMs, integrating the effects of different intervention paths on hallucination. We find that hallucinations in LVLMs do not arise from a single causal path, but rather from the interplay among image-to-input-text, image-to-output-text, and text-to-text pathways. For the first time, we also find that LVLMs rely on different pathways depending on the question-answer alignment format. Building on these insights, we propose simple yet effective methods to identify and intervene on critical hallucination heads within each pathway, tailored to discriminative and generative formats. Experiments across multiple benchmarks demonstrate that our…
Taxonomy
Topics: Multimodal Machine Learning Applications · Misinformation and Its Impacts · Adversarial Robustness in Machine Learning
