Benchmarking Direct Preference Optimization for Medical Large Vision-Language Models
Dain Kim, Jiwoo Lee, Jaehoon Yun, Yong Hoe Koo, Qingyu Chen, Hyunjae Kim, Jaewoo Kang

TL;DR
This paper evaluates various Direct Preference Optimization (DPO) methods for medical vision-language models, identifying limitations and proposing a targeted strategy that improves visual question-answering accuracy by 3.6%.
Contribution
It provides the first comprehensive empirical analysis of DPO variants in medical LVLMs and introduces a new preference construction method to address visual misinterpretation errors.
Findings
Current DPO approaches yield inconsistent gains over supervised fine-tuning, with efficacy varying significantly across tasks and backbones.
Current DPO methods often fail to fix visual misinterpretation errors.
A targeted preference strategy improves visual QA performance by 3.6%.
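For readers unfamiliar with the objective being benchmarked, the following is a minimal sketch of the standard DPO loss for a single preference pair. It is not the paper's code, and it does not reproduce any of the nine variants or the proposed preference construction strategy. The function name, argument names, and the default `beta` are illustrative assumptions; inputs are assumed to be summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    All names and the default beta are hypothetical, chosen for illustration.
    Each *_logp is the summed log-probability of a full response.
    """
    # How much more the policy favors each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy prefers the chosen response.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)), computed in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and reference agree exactly, the margin is zero and the loss equals log 2; the loss falls as the policy shifts probability toward the chosen response relative to the reference.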
Abstract
Large Vision-Language Models (LVLMs) hold significant promise for medical applications, yet their deployment is often constrained by insufficient alignment and reliability. While Direct Preference Optimization (DPO) has emerged as a potent framework for refining model responses, its efficacy in high-stakes medical contexts remains underexplored, lacking the rigorous empirical groundwork necessary to guide future methodological advances. To bridge this gap, we present the first comprehensive examination of diverse DPO variants within the medical domain, evaluating nine distinct formulations across two medical LVLMs: LLaVA-Med and HuatuoGPT-Vision. Our results reveal several critical limitations: current DPO approaches often yield inconsistent gains over supervised fine-tuning, with their efficacy varying significantly across different tasks and backbones. Furthermore, they frequently…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Explainable Artificial Intelligence (XAI)
