Vision-Language Introspection: Mitigating Overconfident Hallucinations in MLLMs via Interpretable Bi-Causal Steering

Shuliang Liu; Songbo Yang; Dong Fang; Sihang Jia; Yuqi Tang; Lingfeng Su; Ruoshui Peng; Yibo Yan; Xin Zou; Xuming Hu

arXiv:2601.05159 · cs.CV · January 9, 2026

Open Access

TL;DR

This paper introduces Vision-Language Introspection (VLI), a training-free framework that reduces hallucinations in multimodal models by diagnosing and actively correcting visual misinterpretations through interpretable, instance-specific steering.

Contribution

VLI presents a novel, training-free inference method that diagnoses hallucination risks and dynamically corrects visual evidence interpretation in multimodal models.

Findings

01. Reduces object hallucination rates by 12.67% on MMHal-Bench

02. Improves accuracy by 5.8% on POPE

03. Achieves state-of-the-art performance on advanced models

Abstract

Object hallucination critically undermines the reliability of Multimodal Large Language Models, often stemming from a fundamental failure in cognitive introspection, where models blindly trust linguistic priors over specific visual evidence. Existing mitigations remain limited: contrastive decoding approaches operate superficially without rectifying internal semantic misalignments, while current latent steering methods rely on static vectors that lack instance-specific precision. We introduce Vision-Language Introspection (VLI), a training-free inference framework that simulates a metacognitive self-correction process. VLI first performs Attributive Introspection to diagnose hallucination risks via probabilistic conflict detection and localize the causal visual anchors. It then employs Interpretable Bi-Causal Steering to actively modulate the inference process, dynamically isolating…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis