Averaging log-likelihoods in direct alignment

Nathan Grinsztajn; Yannis Flet-Berliac; Mohammad Gheshlaghi Azar; Florian Strub; Bill Wu; Eugene Choi; Chris Cremer; Arash Ahmadian; Yash Chandak; Olivier Pietquin; Matthieu Geist

arXiv:2406.19188 · cs.LG · June 28, 2024

TL;DR

This paper proposes a way to make direct alignment of Large Language Models length-invariant: the log-likelihoods compared in the contrastive loss are averaged, so that completions of different lengths are compared on the same scale.

Contribution

It introduces a novel averaging operator for length-invariance in direct alignment, bridging contrastive and supervised training approaches.

Findings

01. Averaging log-likelihoods affects generation scores based on length.

02. The method reveals a trade-off between generation length and quality.

03. Empirical results demonstrate improved alignment consistency.

Abstract

To better align Large Language Models (LLMs) with human judgment, Reinforcement Learning from Human Feedback (RLHF) learns a reward model and then optimizes it using regularized RL. Recently, direct alignment methods were introduced to learn such a fine-tuned model directly from a preference dataset without computing a proxy reward function. These methods are built upon contrastive losses involving the log-likelihood of (dis)preferred completions according to the trained model. However, completions have various lengths, and the log-likelihood is not length-invariant. On the other side, the cross-entropy loss used in supervised training is length-invariant, as batches are typically averaged token-wise. To reconcile these approaches, we introduce a principled approach for making direct alignment length-invariant. Formally, we introduce a new averaging operator, to be composed with the…
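
The length issue described in the abstract can be illustrated with a minimal sketch. It assumes a DPO-style logistic loss over policy-to-reference log-ratios, which is one common direct alignment loss and not necessarily the exact formulation used in the paper. The function names, the `beta` value, and the toy per-token log-probabilities are all hypothetical, chosen only for illustration.

```python
import math

def sequence_score(policy_logps, ref_logps, average):
    """Log-ratio of policy to reference model over one completion.

    policy_logps and ref_logps are per-token log-probabilities (toy inputs).
    average=False sums over tokens; average=True takes the token-wise mean.
    """
    total = sum(p - r for p, r in zip(policy_logps, ref_logps))
    return total / len(policy_logps) if average else total

def dpo_style_loss(chosen, rejected, beta=0.1, average=False):
    """Logistic contrastive loss: -log(sigmoid(beta * (score_chosen - score_rejected)))."""
    margin = beta * (
        sequence_score(*chosen, average=average)
        - sequence_score(*rejected, average=average)
    )
    # -log(sigmoid(m)) is equal to log(1 + exp(-m))
    return math.log1p(math.exp(-margin))

# Toy data: every token has the same log-ratio (0.5), only the lengths differ.
chosen = ([-1.0] * 2, [-1.5] * 2)    # 2-token preferred completion
rejected = ([-1.0] * 8, [-1.5] * 8)  # 8-token dispreferred completion

loss_sum = dpo_style_loss(chosen, rejected, average=False)
loss_avg = dpo_style_loss(chosen, rejected, average=True)

# Summed scores differ only because of length (1.0 vs 4.0).
# Averaged scores are both 0.5, so the margin is 0 and the loss is log(2).
print(loss_sum, loss_avg)
```

With summed log-ratios, the longer completion gets a larger score purely because it has more tokens, so the loss is driven by length even though the per-token behaviour is identical. With the token-wise mean, the two completions score the same and the loss no longer depends on their lengths.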
