Reducing Hallucination in Vision-Language Models via Stage-wise Preference Optimization under Distribution Shift
Qinwu Xu

TL;DR
This paper introduces a stage-wise preference optimization framework to reduce hallucinations in vision-language models by constructing targeted multimodal data and preference pairs, improving grounding and consistency.
Contribution
The proposed approach progressively constructs hallucination-focused preference pairs near failure boundaries, enhancing multimodal grounding and reducing hallucinations in VLMs.
Findings
Improved grounding consistency and reduced hallucination on benchmarks.
More visually grounded responses than proprietary VLMs in ambiguous scenarios.
Hallucination arises from autoregressive tendencies, not just model capacity.
Abstract
Hallucination remains a fundamental challenge in vision-language models (VLMs), where autoregressive generation may produce linguistically plausible yet physically inconsistent or visually ungrounded responses due to likelihood maximization under joint probabilistic modeling. We propose a stage-wise preference optimization framework for hallucination reduction through targeted multimodal data construction. Rather than directly optimizing on generic instruction-following data, our approach progressively constructs hallucination-focused preference pairs near known failure boundaries. The framework emphasizes ambiguous spatial orientation, object relationships, OCR uncertainty, and adversarial false-premise training. Hallucinated negatives are generated through minimally perturbed yet visually inconsistent alternatives, enabling Direct Preference Optimization (DPO) to better separate…
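To make the preference-optimization step concrete, here is a minimal sketch of the standard DPO loss applied to one hallucination-focused preference pair. It is not the paper's implementation: the example pair, its field names, the log-probability values, and the `beta` default are all hypothetical and chosen only for illustration. The loss itself is the usual DPO objective, the negative log-sigmoid of the scaled difference between the policy-versus-reference log-ratios of the chosen and rejected responses.

```python
import math

# Hypothetical preference pair: the rejected answer is a minimal
# perturbation of the chosen one that contradicts the image.
# Field names and contents are illustrative, not taken from the paper.
pair = {
    "image": "kitchen.jpg",
    "prompt": "Which side of the sink is the mug on?",
    "chosen": "The mug is on the left side of the sink.",
    "rejected": "The mug is on the right side of the sink.",
}


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair of sequence log-probabilities.

    Each argument is the summed token log-probability of a full response
    under the trainable policy or the frozen reference model.
    beta=0.1 is an assumed default, not a value from the paper.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)) written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Made-up log-probabilities for the pair above.
loss_grounded = dpo_loss(-4.0, -9.0, -6.0, -6.0)      # policy prefers the grounded answer
loss_hallucinated = dpo_loss(-9.0, -4.0, -6.0, -6.0)  # policy prefers the hallucination
```

When the policy and the reference agree exactly, the margin is zero and the loss equals ln 2. The loss falls below that value as the policy shifts probability toward the visually grounded response, and rises above it when the policy favors the perturbed one.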
