Many of Your DPOs are Secretly One: Attempting Unification Through Mutual Information

Rasul Tutnov; Antoine Grosnit; Haitham Bou-Ammar

arXiv:2501.01544 · cs.LG · January 6, 2025

Open Access

TL;DR

This paper introduces a mutual information-based unifying framework for preference optimization in large language models, simplifying understanding of various DPO variants and aiding future alignment research.

Contribution

The paper proposes a flexible loss function whose priors can be specified to recover many existing DPO algorithms, with the aim of making LLM alignment methods easier to interpret and develop.

Findings

01. Many DPO variants can be derived from the proposed framework

02. The framework clarifies relationships between different alignment algorithms

03. Potential for developing more robust and interpretable alignment techniques

Abstract

Post-alignment of large language models (LLMs) is critical in improving their utility, safety, and alignment with human intentions. Direct preference optimisation (DPO) has become one of the most widely used algorithms for achieving this alignment, given its ability to optimise models based on human feedback directly. However, the vast number of DPO variants in the literature has made it increasingly difficult for researchers to navigate and fully grasp the connections between these approaches. This paper introduces a unifying framework inspired by mutual information, which proposes a new loss function with flexible priors. By carefully specifying these priors, we demonstrate that many existing algorithms, such as SimPO, TDPO, SparsePO, and others, can be derived from our framework. This unification offers a clearer and more structured approach, allowing researchers to understand the…
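For orientation, here is a minimal sketch of two of the objectives the abstract refers to: the standard DPO loss and the SimPO loss, each for a single preference pair. It is not the paper's unified loss, which the abstract does not spell out. The function names, argument names, and default values of `beta` and `gamma` are illustrative assumptions, and the inputs are assumed to be summed sequence log-probabilities.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(logp_w: float, logp_l: float,
             ref_logp_w: float, ref_logp_l: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    logp_* are policy log-probabilities of the chosen (w) and rejected (l)
    responses; ref_logp_* are the same quantities under the frozen
    reference model. beta=0.1 is an illustrative default.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -log_sigmoid(margin)


def simpo_loss(logp_w: float, logp_l: float,
               len_w: int, len_l: int,
               beta: float = 2.0, gamma: float = 0.5) -> float:
    """SimPO loss for one pair: length-normalised log-probabilities,
    no reference model, and a target margin gamma.

    beta and gamma defaults are illustrative, not recommended settings.
    """
    margin = beta * (logp_w / len_w - logp_l / len_l) - gamma
    return -log_sigmoid(margin)
```

Both losses apply a negative log-sigmoid to a margin between the chosen and rejected responses; they differ in how that margin is built (reference-model log-ratios for DPO, length-normalised log-probabilities with a fixed offset for SimPO).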

Taxonomy

Topics: Constraint Satisfaction and Optimization · Multimodal Machine Learning Applications · Machine Learning and Data Classification

Methods: Direct Preference Optimization