Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
Zhiyuan Zhao, Bin Wang, Linke Ouyang, Xiaoyi Dong, Jiaqi Wang, Conghui He

TL;DR
This paper presents HA-DPO, a novel training method that reduces hallucinations in multimodal large language models by teaching them to prefer accurate responses over fabricated ones, significantly improving their reliability.
Contribution
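The paper introduces Hallucination-Aware Direct Preference Optimization (HA-DPO), a training approach that reduces hallucinations in multimodal models by recasting hallucination mitigation as a preference selection task: given an accurate and a hallucinatory response to the same image, the model learns to favor the accurate one.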
Findings
HA-DPO significantly reduces hallucinations in the three mainstream multimodal models tested.
Applying HA-DPO to MiniGPT-4 improved POPE accuracy from 51.13% to 86.13%.
The approach enhances models' generalization capabilities.
Abstract
Multimodal large language models have made significant advancements in recent years, yet they still suffer from a common issue known as the "hallucination problem", in which the models generate textual descriptions that inaccurately depict or entirely fabricate content from associated images. This paper introduces a novel solution, Hallucination-Aware Direct Preference Optimization (HA-DPO), which reframes the hallucination problem as a preference selection task. The model is trained to favor the non-hallucinating response when presented with two responses of the same image (one accurate and one hallucinatory). Furthermore, this paper proposes an efficient pipeline for constructing positive (non-hallucinatory) and negative (hallucinatory) sample pairs, ensuring a high-quality, style-consistent dataset for robust preference learning. When applied to three mainstream multimodal models,…
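To make the preference selection idea concrete, here is a minimal sketch of the standard Direct Preference Optimization loss for a single (accurate, hallucinatory) response pair. It assumes the generic DPO objective; the paper's exact loss, hyperparameters, and any auxiliary terms may differ. The function name, argument names, and the default beta = 0.1 are illustrative assumptions, not taken from the paper.

```python
import math


def dpo_loss(policy_logp_pos, policy_logp_neg,
             ref_logp_pos, ref_logp_neg, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    *_logp_pos: log-probability of the accurate (non-hallucinatory) response
    *_logp_neg: log-probability of the hallucinatory response
    policy_*:   scores from the model being trained
    ref_*:      scores from a frozen reference copy of the model
    beta:       strength of the pull toward the reference (assumed value)
    """
    # How much more the policy prefers each response than the reference does.
    pos_gain = policy_logp_pos - ref_logp_pos
    neg_gain = policy_logp_neg - ref_logp_neg
    margin = beta * (pos_gain - neg_gain)

    # loss = -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities, chosen only to show the direction of the loss.
favors_accurate = dpo_loss(-5.0, -9.0, -6.0, -6.0)
favors_hallucination = dpo_loss(-9.0, -5.0, -6.0, -6.0)
print(favors_accurate < favors_hallucination)
```

The loss shrinks as the trained model assigns relatively more probability to the accurate response than the reference model does, which is the behavior the abstract describes as favoring the non-hallucinating response.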
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Topic Modeling · Advanced Text Analysis Techniques
