From RLHF to Direct Alignment: A Theoretical Unification of Preference Learning for Large Language Models

Tarun Raheja; Nilay Pochhi

arXiv:2601.06108 · cs.AI · January 13, 2026

Open Access

TL;DR

This survey places preference learning methods for aligning large language models (RLHF, DPO, IPO, KTO, SimPO, and others) in a single theoretical framework, clarifying how they differ and helping practitioners choose among them.

Contribution

It provides a formal unification of preference learning approaches, characterizing them along three axes (preference model, regularization mechanism, and data distribution), establishing key theoretical results, and identifying failure modes.

Findings

01

Reveals the underlying structure of preference learning methods.

02

Establishes scaling laws and conditions for method failure.

03

Provides a decision guide for practitioners.

Abstract

Aligning large language models (LLMs) with human preferences has become essential for safe and beneficial AI deployment. While Reinforcement Learning from Human Feedback (RLHF) established the dominant paradigm, a proliferation of alternatives, including Direct Preference Optimization (DPO), Identity Preference Optimization (IPO), Kahneman-Tversky Optimization (KTO), Simple Preference Optimization (SimPO), and many others, has left practitioners without clear guidance on method selection. This survey provides a theoretical unification of preference learning methods, revealing that the apparent diversity reduces to principled choices along three orthogonal axes: (I) Preference Model (what likelihood model underlies the objective), (II) Regularization Mechanism (how deviation from reference policies is controlled), and (III) Data Distribution (online vs. …

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI