HII-DPO: Eliminate Hallucination via Accurate Hallucination-Inducing Counterfactual Images

Yilin Yang; Zhenghui Guo; Yuke Wang; Omprakash Gnawali; Sheng Di; Chengming Zhang

arXiv:2602.10425·cs.CV·February 12, 2026

TL;DR

This paper introduces a pipeline for synthesizing Hallucination-Inducing Images (HIIs), uses them to expose a consistent scene-conditioned hallucination pattern in vision-language models, and proposes a fine-tuning approach that reduces hallucinations while maintaining model performance.

Contribution

The work presents a method for generating hallucination-inducing images, the Masked-Object-Hallucination (MOH) benchmark for evaluating how susceptible vision-language models are to hallucination, and a fine-tuning approach that mitigates hallucinations in those models.

Findings

1. Achieves up to 38% reduction in hallucinations on benchmarks.
2. Reveals consistent scene-conditioned hallucination patterns.
3. Provides high-quality datasets for model alignment.

Abstract

Large Vision-Language Models (VLMs) have achieved remarkable success across diverse multimodal tasks but remain vulnerable to hallucinations rooted in inherent language bias. Despite recent progress, existing hallucination mitigation methods often overlook the underlying hallucination patterns driven by language bias. In this work, we design a novel pipeline to accurately synthesize Hallucination-Inducing Images (HIIs). Using synthesized HIIs, we reveal a consistent scene-conditioned hallucination pattern: models tend to mention objects that are highly typical of the scene even when visual evidence is removed. To quantify the susceptibility of VLMs to this hallucination pattern, we establish the Masked-Object-Hallucination (MOH) benchmark to rigorously evaluate existing state-of-the-art alignment frameworks. Finally, we leverage HIIs to construct high-quality preference datasets for…
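The pattern described in the abstract (a model still mentions an object after its visual evidence has been removed) can be illustrated with a small sketch. This is not the MOH benchmark's actual metric, which the text above does not specify. The function names `mentions` and `masked_object_mention_rate`, the whole-word matching rule, and the sample captions are all hypothetical and chosen only for illustration.

```python
import re


def mentions(caption: str, obj: str) -> bool:
    """Return True if `obj` appears in `caption` as a whole word.

    Assumption: naive matching with optional "s"/"es" plural suffix;
    a real evaluation would likely need synonym and plural handling.
    """
    pattern = r"\b" + re.escape(obj.lower()) + r"(s|es)?\b"
    return re.search(pattern, caption.lower()) is not None


def masked_object_mention_rate(samples) -> float:
    """Fraction of captions that mention an object removed from the image.

    `samples` is an iterable of (caption, masked_object) pairs, where the
    caption was generated for an image in which `masked_object` is no
    longer visible. Higher values mean more hallucinated mentions.
    """
    samples = list(samples)
    if not samples:
        return 0.0
    hits = sum(mentions(caption, obj) for caption, obj in samples)
    return hits / len(samples)


# Hypothetical captions for images whose named object was masked out.
samples = [
    ("A kitchen with a stove and a refrigerator.", "refrigerator"),
    ("A kitchen counter next to a window.", "refrigerator"),
    ("Two forks on a dining table.", "fork"),
    ("An empty street at dusk.", "car"),
]
rate = masked_object_mention_rate(samples)
print(f"masked-object mention rate: {rate:.2f}")
```

In this toy data, two of the four captions name the removed object, so the rate is 0.50. Comparing such a rate before and after alignment is one simple way to check whether a model has become less reliant on scene-typical language priors.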

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis