Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models

Rei Higuchi; Taiji Suzuki

arXiv:2505.07558·cs.LG·May 21, 2025

Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models

Rei Higuchi, Taiji Suzuki

PDF

Open Access 1 Video

TL;DR

This paper introduces Direct Density Ratio Optimization (DDRO), a novel, statistically consistent method for aligning large language models with human preferences by directly estimating preference ratios, improving reliability and performance.

Contribution

The paper proposes DDRO, a new alignment technique that avoids preference model assumptions and guarantees convergence to true preferences as data increases.

Findings

01

DDRO outperforms existing methods on major benchmarks.

02

DDRO is statistically consistent regardless of preference structure.

03

The approach enables more reliable, data-driven LLM alignment.

Abstract

Aligning large language models (LLMs) with human preferences is crucial for safe deployment, yet existing methods assume specific preference models like Bradley-Terry model. This assumption leads to statistical inconsistency, where more data doesn't guarantee convergence to true human preferences. To address this critical gap, we introduce a novel alignment method Direct Density Ratio Optimization (DDRO). DDRO directly estimates the density ratio between preferred and unpreferred output distributions, circumventing the need for explicit human preference modeling. We theoretically prove that DDRO is statistically consistent, ensuring convergence to the true preferred distribution as the data size grows, regardless of the underlying preference structure. Experiments demonstrate that DDRO achieves superior performance compared to existing methods on many major benchmarks. DDRO unlocks the…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models· slideslive

Taxonomy

TopicsMachine Learning and Data Classification · Topic Modeling · Gaussian Processes and Bayesian Inference