Causally-Grounded Dual-Path Attention Intervention for Object Hallucination Mitigation in LVLMs

Liu Yu; Zhonghao Chen; Ping Kuang; Zhikun Feng; Fan Zhou; Lan Wang; Gillian Dobbie

arXiv:2511.09018·cs.CV·November 13, 2025

TL;DR

This paper introduces Owl, a causally-grounded framework for reducing object hallucinations in LVLMs by modeling attention interactions, quantifying modality contributions, and dynamically intervening during decoding, leading to state-of-the-art results.

Contribution

The paper proposes a novel causally-grounded attention intervention framework with a new metric VTACR and a dual-path decoding strategy to effectively mitigate hallucinations in LVLMs.

Findings

01. Owl significantly reduces hallucinations on POPE and CHAIR benchmarks.

02. VTACR correlates with hallucination likelihood, guiding interventions.

03. Dual-path contrastive decoding improves faithfulness without sacrificing understanding.
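
The contrastive step named in the last finding can be illustrated with the generic contrastive-decoding form, in which logits from one decoding path are amplified and logits from a second path are subtracted. This is only a sketch under stated assumptions: this page does not describe how Owl builds its two paths, so `path_a` (taken here to be the better-grounded path), `path_b` (taken to be the path dominated by textual priors), the weight `alpha`, and the function names are all hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_logits(path_a: List[float], path_b: List[float], alpha: float = 1.0) -> List[float]:
    """Generic contrastive decoding: (1 + alpha) * path_a - alpha * path_b.

    Assumption: path_a is the path we trust more, path_b the one whose
    preferences we want to suppress. This is not taken from the paper.
    """
    if len(path_a) != len(path_b):
        raise ValueError("both paths must score the same vocabulary")
    return [(1 + alpha) * a - alpha * b for a, b in zip(path_a, path_b)]


# Toy three-token vocabulary. Token 1 is favoured only by path_b.
path_a = [2.0, 1.0, 0.5]
path_b = [2.0, 2.5, 0.5]

adjusted = contrastive_logits(path_a, path_b, alpha=1.0)
probs = softmax(adjusted)
best = max(range(len(probs)), key=probs.__getitem__)
```

In this toy case the token that only `path_b` prefers is pushed down, and the token both paths agree on stays on top. The choice of `alpha` controls how strongly the second path is penalised.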

Abstract

Object hallucination remains a critical challenge in Large Vision-Language Models (LVLMs), where models generate content inconsistent with visual inputs. Existing language-decoder-based mitigation approaches often regulate visual or textual attention independently, overlooking their interaction as two key causal factors. To address this, we propose Owl (Bi-mOdal attention reWeighting for Layer-wise hallucination mitigation), a causally-grounded framework that models the hallucination process via a structural causal graph, treating decomposed visual and textual attentions as mediators. We introduce VTACR (Visual-to-Textual Attention Contribution Ratio), a novel metric that quantifies the modality contribution imbalance during decoding. Our analysis reveals that hallucinations frequently occur in low-VTACR scenarios, where textual priors dominate and visual grounding is weakened. To mitigate…
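
To make the idea of a visual-to-textual attention ratio concrete, here is a minimal sketch. The abstract as shown does not give the exact VTACR formula, so the definition used below (mean attention weight on visual tokens divided by mean attention weight on textual tokens, for a single decoding step) is an assumption, and the function name, index arguments, and `eps` are hypothetical.

```python
from typing import Sequence


def attention_contribution_ratio(
    attn_row: Sequence[float],
    visual_idx: Sequence[int],
    textual_idx: Sequence[int],
    eps: float = 1e-8,
) -> float:
    """Ratio of mean attention on visual tokens to mean attention on textual tokens.

    Assumed stand-in for VTACR; the paper's exact definition may differ
    (for example, it may aggregate over heads or layers).
    """
    if not visual_idx or not textual_idx:
        raise ValueError("both index sets must be non-empty")
    visual_mean = sum(attn_row[i] for i in visual_idx) / len(visual_idx)
    textual_mean = sum(attn_row[i] for i in textual_idx) / len(textual_idx)
    # eps guards against division by zero when textual attention vanishes.
    return visual_mean / (textual_mean + eps)


# Toy attention row for one generated token: three visual tokens, two textual tokens.
attn = [0.05, 0.05, 0.10, 0.40, 0.40]
visual = [0, 1, 2]
textual = [3, 4]

ratio = attention_contribution_ratio(attn, visual, textual)
```

Here the ratio is well below 1, which under this assumed definition corresponds to the low-VTACR regime the abstract associates with dominant textual priors and weakened visual grounding.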

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis