Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback

Kyuyoung Kim; Ah Jeong Seo; Hao Liu; Jinwoo Shin; Kimin Lee

arXiv:2410.03145 · cs.CL · July 1, 2025

TL;DR

This paper introduces Margin Matching Preference Optimization (MMPO), a novel method that incorporates relative quality margins into LLM fine-tuning, resulting in improved performance and robustness over traditional binary preference methods.

Contribution

The paper proposes MMPO, a new preference optimization approach that uses quality margins and the Bradley-Terry model to enhance LLM alignment with granular feedback.

Findings

01 MMPO outperforms baseline methods on MT-bench and RewardBench.

02 The 7B model trained with MMPO achieves state-of-the-art results on RewardBench.

03 MMPO produces more robust and better-calibrated models.

Abstract

Large language models (LLMs) fine-tuned with alignment techniques, such as reinforcement learning from human feedback, have been instrumental in developing some of the most capable AI systems to date. Despite their success, existing methods typically rely on simple binary labels, such as those indicating preferred outputs in pairwise preferences, which fail to capture the subtle differences in relative quality between pairs. To address this limitation, we introduce an approach called Margin Matching Preference Optimization (MMPO), which incorporates relative quality margins into optimization, leading to improved LLM policies and reward models. Specifically, given quality margins in pairwise preferences, we design soft target probabilities based on the Bradley-Terry model, which are then used to train models with the standard cross-entropy objective. Experiments with both human and AI…
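The soft-target idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `scale` parameter, and the choice to map a margin to a target with a sigmoid (the two-item Bradley-Terry form) are assumptions made for illustration only. It shows how a quality margin becomes a soft target probability, and how the cross-entropy against that target differs from the binary-label case.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softplus(z: float) -> float:
    """Stable log(1 + exp(z)); note that -log(sigmoid(z)) == softplus(-z)."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def soft_target(margin: float, scale: float = 1.0) -> float:
    """Bradley-Terry probability that the preferred response wins.

    `margin` is the quality gap between the two responses; `scale` is a
    hypothetical temperature-like knob (an assumption, not from the source).
    """
    return sigmoid(scale * margin)


def soft_cross_entropy(logit: float, target: float) -> float:
    """Cross-entropy between a soft target and the model probability sigmoid(logit).

    `logit` stands for whatever score difference the model assigns to the
    preferred response over the rejected one.
    """
    return target * softplus(-logit) + (1.0 - target) * softplus(logit)


if __name__ == "__main__":
    target = soft_target(margin=2.0)
    for logit in (0.0, 2.0, 6.0):
        soft = soft_cross_entropy(logit, target)
        hard = soft_cross_entropy(logit, 1.0)  # binary label: target fixed at 1
        print(f"logit={logit:.1f}  soft-target loss={soft:.4f}  binary loss={hard:.4f}")
```

Under these assumptions, the soft-target loss is minimized when the model's score difference equals the scaled margin, whereas the binary-label loss keeps decreasing as the score difference grows. That is the sense in which a margin-aware target can discourage overconfident separation of nearly equal responses.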

Code & Models

Repositories

kykim0/margin-matching-pref-opt
PyTorch · Official

Taxonomy

Topics: Data Management and Algorithms