Lightweight Robust Direct Preference Optimization

Cheol Woo Kim; Shresth Verma; Mauricio Tec; Milind Tambe

arXiv:2510.23590 · cs.LG · October 28, 2025

TL;DR

This paper introduces DPO-PRO, a lightweight robust fine-tuning method for large language models. It improves resistance to noisy preference data by modeling uncertainty in the preference distribution only, which keeps the added computational cost negligible.

Contribution

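The paper proposes DPO-PRO (DPO with Preference Robustness), a DPO-based algorithm that uses a lightweight distributionally robust optimization (DRO) formulation restricted to uncertainty in the preference distribution. This improves robustness to noise while avoiding the excessive conservatism and high computational cost of prior DRO-based variants.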

Findings

1. DPO-PRO outperforms existing DPO variants on standard benchmarks.
2. It effectively reduces overfitting caused by noisy preference signals.
3. The method incurs negligible additional computational cost.

Abstract

Direct Preference Optimization (DPO) has become a popular method for fine-tuning large language models (LLMs) due to its stability and simplicity. However, it is also known to be sensitive to noise in the data and prone to overfitting. Recent works have proposed using distributionally robust optimization (DRO) to address potential noise and distributional shift in the data. However, these methods often suffer from excessive conservatism and high computational cost. We propose DPO-PRO (DPO with Preference Robustness), a robust fine-tuning algorithm based on DPO which accounts for uncertainty in the preference distribution through a lightweight DRO formulation. Unlike prior DRO-based variants, DPO-PRO focuses solely on uncertainty in preferences, avoiding unnecessary conservatism and incurring negligible computational overhead. We further show that DPO-PRO is equivalent to a regularized…
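To make the idea of robustness over the preference distribution concrete, here is a minimal sketch in plain Python. It is not the paper's formulation: the abstract does not state the uncertainty set, so the sketch assumes a simple interval of half-width `rho` around a soft preference label `q`. The function names, `rho`, and `q` are all hypothetical. Because the soft-label DPO loss is linear in the preference probability, the worst case over an interval sits at one of its endpoints, which is why the robust version costs only two extra loss evaluations per pair.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_margin(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO implicit-reward margin between chosen (w) and rejected (l)."""
    return beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))


def soft_dpo_loss(margin: float, p: float) -> float:
    """DPO cross-entropy with soft label p = P(w preferred over l).

    p = 1.0 recovers the usual DPO loss, -log(sigmoid(margin)).
    """
    return -(p * log_sigmoid(margin) + (1.0 - p) * log_sigmoid(-margin))


def robust_dpo_loss(margin: float, q: float, rho: float) -> float:
    """Worst-case soft DPO loss over p in [q - rho, q + rho], clipped to [0, 1].

    ASSUMPTION: the interval uncertainty set is illustrative only and is not
    taken from the paper. The loss is linear in p, so checking the two
    endpoints is enough to find the maximum.
    """
    lo = max(0.0, q - rho)
    hi = min(1.0, q + rho)
    return max(soft_dpo_loss(margin, lo), soft_dpo_loss(margin, hi))


if __name__ == "__main__":
    m = dpo_margin(logp_w=-10.0, logp_l=-12.0, ref_logp_w=-11.0, ref_logp_l=-11.0)
    print("margin:", m)
    print("soft loss (q=0.9):", soft_dpo_loss(m, 0.9))
    print("robust loss (q=0.9, rho=0.2):", robust_dpo_loss(m, 0.9, 0.2))
```

With `rho = 0` the robust loss reduces to the soft-label DPO loss, and larger `rho` values make the objective more pessimistic about the recorded preference.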
