Evaluation of Best-of-N Sampling Strategies for Language Model Alignment

Yuki Ichihara; Yuu Jinnai; Tetsuro Morimura; Kaito Ariu; Kenshi Abe; Mitsuki Sakamoto; Eiji Uchibe

arXiv:2502.12668·cs.CL·February 19, 2025

Open Access

TL;DR

This paper analyzes and extends Best-of-N sampling strategies for language model alignment, introducing Stochastic RBoN with theoretical guarantees and evaluating how regularization affects performance on a proxy for the true reward.

Contribution

The paper proposes Stochastic RBoN, a theoretically grounded extension of RBoN, and evaluates several regularization strategies for improving language model alignment with human preferences.

Findings

01 Regularization strategies improve performance on a proxy for the true reward.

02 Sentence Length Regularized BoN outperforms previous methods.

03 Stochastic RBoN offers theoretical guarantees for worst-case optimization.

Abstract

Best-of-N (BoN) sampling with a reward model has been shown to be an effective strategy for aligning Large Language Models (LLMs) with human preferences at decoding time. However, BoN sampling is susceptible to a problem known as reward hacking: since the reward model is an imperfect proxy for the true objective, optimizing its value too aggressively can degrade performance on the true objective. Previous work proposes Regularized BoN sampling (RBoN), a variant of BoN sampling that adds a regularization term to the objective, and shows empirically that it outperforms BoN sampling by mitigating reward hacking (Jinnai et al., 2024). However, Jinnai et al. (2024) introduce RBoN as a heuristic and do not analyze why such a regularization strategy improves the performance of BoN sampling. The aim of this study is to analyze the effect of BoN sampling on…
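
To make the selection step described in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It assumes the N candidate responses have already been sampled from the language model. The names `best_of_n`, `regularized_best_of_n`, `toy_reward`, and `toy_penalty` are hypothetical, and the penalty term is a generic stand-in for "regularization added to the objective", not the specific RBoN or Stochastic RBoN formulation, which this page does not spell out.

```python
from typing import Callable, Sequence


def best_of_n(candidates: Sequence[str], reward: Callable[[str], float]) -> str:
    """Plain BoN: return the candidate with the highest proxy reward."""
    return max(candidates, key=reward)


def regularized_best_of_n(
    candidates: Sequence[str],
    reward: Callable[[str], float],
    penalty: Callable[[str], float],
    beta: float,
) -> str:
    """Regularized selection: maximize reward(y) - beta * penalty(y).

    The penalty is an assumed, generic regularizer chosen for illustration.
    """
    return max(candidates, key=lambda y: reward(y) - beta * penalty(y))


# Toy proxy reward (assumption): counts the word "great", so it is easy to hack
# by repeating that word.
def toy_reward(text: str) -> float:
    return float(text.split().count("great"))


# Toy regularizer (assumption): number of whitespace-separated tokens.
def toy_penalty(text: str) -> float:
    return float(len(text.split()))


candidates = ["great", "great great great great", "okay"]

bon_choice = best_of_n(candidates, toy_reward)
reg_choice = regularized_best_of_n(candidates, toy_reward, toy_penalty, beta=2.0)
```

In this toy setup, plain BoN selects the repetitive candidate because it maximizes the hackable proxy, while the penalized objective with `beta=2.0` selects the short one. Setting `beta=0.0` recovers plain BoN, which illustrates how the regularization strength trades proxy reward against the penalty.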

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis