Sample-Efficient Alignment for LLMs

Zichen Liu; Changyu Chen; Chao Du; Wee Sun Lee; Min Lin

arXiv:2411.01493 · cs.LG · November 12, 2024

TL;DR

This paper introduces SEA, a sample-efficient algorithm for aligning large language models with human preferences. It draws on bandit theory and active exploration, and is validated across three model scales and three preference learning algorithms.

Contribution

The paper presents a novel bandit-based algorithm, SEA, for efficient LLM alignment, and provides extensive empirical validation and an open-source implementation.

Findings

01. SEA outperforms recent active exploration methods in sample efficiency.

02. The algorithm is effective across multiple model scales and preference learning algorithms.

03. Extensive experiments validate the practical utility of SEA for online LLM alignment.

Abstract

We study methods for efficiently aligning large language models (LLMs) with human preferences given budgeted online feedback. We first formulate the LLM alignment problem in the framework of contextual dueling bandits. This formulation, which subsumes recent paradigms such as online RLHF and online DPO, inherently calls for sample-efficient algorithms that incorporate online active exploration. Leveraging insights from bandit theory, we introduce a unified algorithm based on Thompson sampling and highlight its applications in two distinct LLM alignment scenarios. The practical agent that efficiently implements this algorithm, named SEA (Sample-Efficient Alignment), is empirically validated through extensive experiments across three model scales (1B, 2.8B, 6.9B) and three preference learning algorithms (DPO, IPO, SLiC). The results demonstrate that SEA achieves highly sample-efficient alignment…
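To make the dueling-bandit framing in the abstract concrete, the following is a minimal sketch of Thompson sampling for a dueling bandit. It is not the SEA agent and not code from sail-sg/oat. It assumes a single fixed prompt with a handful of candidate responses (so it is non-contextual), Beta posteriors over pairwise win rates, and a simulated Bradley-Terry preference oracle. All names (`bt_prob`, `sample_pair`, `run`, `true_rewards`) are hypothetical and chosen for illustration.

```python
import math
import random


def bt_prob(r_a: float, r_b: float) -> float:
    """Bradley-Terry probability that a response with reward r_a beats one with r_b."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b)))


def sample_pair(wins, rng):
    """Choose two distinct arms to duel from posterior samples of pairwise win rates.

    wins[i][j] counts how often arm i has beaten arm j so far.
    """
    k = len(wins)
    # Draw one plausible win-probability matrix from the Beta posteriors.
    theta = [[0.5] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            theta[i][j] = rng.betavariate(wins[i][j] + 1, wins[j][i] + 1)
            theta[j][i] = 1.0 - theta[i][j]
    # First arm: the one that beats the most others under the sampled matrix.
    scores = [sum(theta[i][j] > 0.5 for j in range(k) if j != i) for i in range(k)]
    best = max(scores)
    first = rng.choice([i for i in range(k) if scores[i] == best])
    # Second arm: fresh posterior draws, pick the strongest challenger to `first`.
    challenge = {
        j: rng.betavariate(wins[j][first] + 1, wins[first][j] + 1)
        for j in range(k)
        if j != first
    }
    second = max(challenge, key=challenge.get)
    return first, second


def run(true_rewards, budget, seed=0):
    """Spend `budget` preference queries and return the pairwise win counts."""
    rng = random.Random(seed)
    k = len(true_rewards)
    wins = [[0] * k for _ in range(k)]
    for _ in range(budget):
        a, b = sample_pair(wins, rng)
        # Simulated oracle (an assumption): preference follows Bradley-Terry.
        if rng.random() < bt_prob(true_rewards[a], true_rewards[b]):
            wins[a][b] += 1
        else:
            wins[b][a] += 1
    return wins


if __name__ == "__main__":
    counts = run(true_rewards=[0.0, 0.5, 2.0], budget=200, seed=1)
    for row in counts:
        print(row)
```

Each query to the oracle stands in for one unit of the budgeted online feedback the abstract describes. Sampling from the posterior, rather than acting on point estimates, is what drives exploration: pairs whose outcome is still uncertain keep being selected until the evidence settles. A contextual version would replace the count matrix with a learned, uncertainty-aware reward model conditioned on the prompt.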

Code & Models

Repositories

sail-sg/oat (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Direct Preference Optimization