DA-DPO: Cost-efficient Difficulty-aware Preference Optimization for Reducing MLLM Hallucinations

Longtian Qiu; Shan Ning; Chuyu Zhang; Jiaxuan Sun; Xuming He

arXiv:2601.00623 · cs.AI · January 5, 2026

TL;DR

DA-DPO is a cost-effective, difficulty-aware preference optimization framework that reduces hallucinations in multimodal large language models by rebalancing training away from easily distinguishable preference pairs and toward harder ones.

Contribution

It proposes a difficulty estimation and reweighting method for multimodal preference optimization: difficulty scores are derived from pre-trained vision-language models, and preference pairs are reweighted by difficulty to improve hallucination mitigation.

Findings

01. Enhanced robustness to hallucinations across benchmarks

02. Improved generalization in multimodal preference optimization

03. Maintained computational efficiency

Abstract

Direct Preference Optimization (DPO) has shown strong potential for mitigating hallucinations in Multimodal Large Language Models (MLLMs). However, existing multimodal DPO approaches often suffer from overfitting due to the difficulty imbalance in preference data. Our analysis shows that MLLMs tend to overemphasize easily distinguishable preference pairs, which hinders fine-grained hallucination suppression and degrades overall performance. To address this issue, we propose Difficulty-Aware Direct Preference Optimization (DA-DPO), a cost-effective framework designed to balance the learning process. DA-DPO consists of two main components: (1) Difficulty Estimation leverages pre-trained vision-language models with complementary generative and contrastive objectives, whose outputs are integrated via a distribution-aware voting strategy to produce robust difficulty scores without…
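
To make the idea of difficulty-aware reweighting concrete, here is a minimal Python sketch. It uses the standard DPO loss for a single preference pair and then applies a per-pair weight taken from a difficulty score. The abstract above is truncated and does not give the paper's actual weighting rule, so `weighted_dpo_loss`, the assumption that difficulty scores lie in [0, 1], and the mean-one normalisation are all hypothetical illustrations, not the authors' method.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (chosen, rejected) preference pair."""
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    return -log_sigmoid(beta * margin)


def weighted_dpo_loss(batch, difficulties, beta: float = 0.1) -> float:
    """Hypothetical difficulty-weighted mean of per-pair DPO losses.

    ASSUMPTION: each difficulty is a score in [0, 1], higher = harder.
    Weights are rescaled to average 1 so the loss scale stays comparable
    to the unweighted mean. This rule is illustrative only.
    """
    total = sum(difficulties)
    if total <= 0:
        raise ValueError("difficulty scores must sum to a positive value")
    n = len(difficulties)
    weights = [d * n / total for d in difficulties]
    losses = [dpo_loss(*pair, beta=beta) for pair in batch]
    return sum(w * l for w, l in zip(weights, losses)) / n


# Each tuple: (policy_chosen, policy_rejected, ref_chosen, ref_rejected) log-probs.
batch = [(-1.0, -3.0, -1.5, -2.5),   # easy pair: large reward margin
         (-2.0, -2.1, -2.0, -2.05)]  # hard pair: margin close to zero
uniform = weighted_dpo_loss(batch, [0.5, 0.5])
hard_focused = weighted_dpo_loss(batch, [0.1, 0.9])
```

With equal difficulty scores the function reduces to the ordinary batch mean. With the second pair scored as harder, its larger loss dominates the total, which is the qualitative effect of shifting emphasis away from easily distinguishable pairs.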

Taxonomy

Topics: Advanced Graph Neural Networks · Multimodal Machine Learning Applications · Constraint Satisfaction and Optimization