Best-of-Tails: Bridging Optimism and Pessimism in Inference-Time Alignment
Hsiang Hsu, Eric Lei, and Chun-Fu Chen

TL;DR
This paper introduces Best-of-Tails (BoT), an adaptive inference-time alignment method for large language models that balances optimism and pessimism based on reward distribution tail behavior, improving alignment performance.
Contribution
The paper formalizes the optimism-pessimism trade-off in inference-time alignment and proposes BoT, which dynamically adjusts its selection strategy based on an estimate of the reward distribution's tail.
Findings
BoT outperforms fixed strategies in alignment tasks.
Reward tail heaviness influences optimal inference strategy.
BoT adapts to different reward distributions effectively.
Abstract
Inference-time alignment effectively steers large language models (LLMs) by generating multiple candidates from a reference model and selecting among them with an imperfect reward model. However, current strategies face a fundamental dilemma: "optimistic" approaches like Best-of-N suffer from reward hacking, while "pessimistic" regularized methods often stifle the exploration needed to discover high-quality responses. In this work, we formalize this trade-off through the lens of regret minimization, demonstrating that the optimal strategy depends critically on the tail behavior of the reward distribution. We show theoretically that light-tailed regimes favor optimism to unearth high-quality outliers, whereas heavy-tailed regimes require pessimism to guard against reward mis-calibration in the extremes. Guided by this insight, we introduce Best-of-Tails (BoT), an adaptive…
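To make the trade-off in the abstract concrete, the following is a minimal Python sketch of its two ingredients: selecting among N candidates by proxy reward, and measuring how heavy the reward tail is. It is not the paper's BoT algorithm, whose details are cut off above. The function names are hypothetical. Softmax sampling stands in for the "pessimistic" regularized family, and the Hill estimator stands in for reward tail estimation; both are assumptions chosen for illustration.

```python
import math
import random
from typing import Sequence


def best_of_n(rewards: Sequence[float]) -> int:
    """Optimistic selection: index of the candidate with the highest proxy reward."""
    return max(range(len(rewards)), key=lambda i: rewards[i])


def soft_best_of_n(rewards: Sequence[float], temperature: float,
                   rng: random.Random) -> int:
    """Regularized selection (illustrative stand-in for pessimism): sample an
    index with probability proportional to exp(reward / temperature)."""
    peak = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp((r - peak) / temperature) for r in rewards]
    return rng.choices(range(len(rewards)), weights=weights, k=1)[0]


def hill_tail_index(rewards: Sequence[float], k: int) -> float:
    """Hill estimator of the tail index from the k largest rewards.
    Smaller values indicate a heavier tail. Requires positive top rewards."""
    top = sorted(rewards, reverse=True)[: k + 1]
    if k < 1 or len(top) <= k or top[k] <= 0:
        raise ValueError("need k >= 1 and more than k rewards with a positive (k+1)-th largest")
    mean_log_excess = sum(math.log(x / top[k]) for x in top[:k]) / k
    if mean_log_excess == 0:
        raise ValueError("top rewards are all tied; tail index is undefined")
    return 1.0 / mean_log_excess


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical proxy rewards for N = 8 sampled candidates.
    rewards = [0.8, 1.1, 0.9, 1.3, 1.0, 6.5, 1.2, 0.7]
    print("best-of-N pick:", best_of_n(rewards))
    print("soft pick:", soft_best_of_n(rewards, temperature=1.0, rng=rng))
    print("Hill tail index (k=3):", hill_tail_index(rewards, k=3))
```

As the temperature approaches zero, the soft rule reduces to plain Best-of-N; as it grows, selection approaches uniform sampling from the reference model's candidates. An adaptive method in the spirit of the abstract would choose between such behaviors based on a tail estimate, but how BoT actually does so is not stated in the text available here.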
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications
