Best-of-Tails: Bridging Optimism and Pessimism in Inference-Time Alignment

Hsiang Hsu, Eric Lei, and Chun-Fu Chen

arXiv:2603.06797 · cs.AI · March 10, 2026

TL;DR

This paper introduces Best-of-Tails (BoT), an adaptive inference-time alignment method for large language models that balances optimism and pessimism based on reward distribution tail behavior, improving alignment performance.

Contribution

The paper formalizes the optimism-pessimism trade-off in inference-time alignment and proposes BoT, which adjusts its selection strategy dynamically using an estimate of the reward tail for better model alignment.

Findings

1. BoT outperforms fixed strategies in alignment tasks.

2. Reward tail heaviness influences optimal inference strategy.

3. BoT adapts to different reward distributions effectively.
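
To make "tail heaviness" concrete, the sketch below estimates a tail index from a sample of reward scores using the Hill estimator, a standard tool from extreme value theory. This is an illustrative assumption: the summary here does not say which estimator BoT uses, and the function name `hill_tail_index`, the choice of `k`, and the synthetic reward samples are all hypothetical.

```python
import math
import random


def hill_tail_index(rewards, k):
    """Hill estimate of the upper-tail index from the k largest positive values.

    Larger estimates suggest a heavier upper tail. Assumption: this is a
    generic estimator, not necessarily the one used by the paper.
    """
    xs = sorted((r for r in rewards if r > 0), reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < number of positive rewards")
    threshold = xs[k]  # the (k+1)-th largest value acts as the tail threshold
    return sum(math.log(x / threshold) for x in xs[:k]) / k


# Synthetic stand-ins for reward-model scores (hypothetical data).
rng = random.Random(0)
heavy = [rng.paretovariate(2.0) for _ in range(5000)]  # power-law upper tail
light = [rng.expovariate(1.0) for _ in range(5000)]    # exponential upper tail

xi_heavy = hill_tail_index(heavy, k=200)
xi_light = hill_tail_index(light, k=200)
print(f"heavy-tailed sample: {xi_heavy:.3f}")
print(f"light-tailed sample: {xi_light:.3f}")
```

The estimate is sensitive to `k`: too small and it is noisy, too large and it includes non-tail samples, so in practice the value would need tuning for the reward model at hand.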

Abstract

Inference-time alignment effectively steers large language models (LLMs) by generating multiple candidates from a reference model and selecting among them with an imperfect reward model. However, current strategies face a fundamental dilemma: "optimistic" approaches like Best-of-$N$ suffer from reward hacking, while "pessimistic" regularized methods often stifle the exploration needed to discover high-quality responses. In this work, we formalize this trade-off through the lens of regret minimization, demonstrating that the optimal strategy depends critically on the tail behavior of the reward distribution. We show theoretically that light-tailed regimes favor optimism to unearth high-quality outliers, whereas heavy-tailed regimes require pessimism to guard against reward miscalibration in the extremes. Guided by this insight, we introduce Best-of-Tails (BoT), an adaptive…
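
The two fixed strategies the abstract contrasts can be sketched in a few lines. The following is a minimal illustration, not the paper's BoT algorithm: `best_of_n` is plain Best-of-$N$ selection, and `soft_best_of_n` is one common regularized alternative (sampling in proportion to exponentiated reward), chosen here as an assumed stand-in for a "pessimistic" method. The candidate strings, reward values, and temperature are hypothetical.

```python
import math
import random


def best_of_n(candidates, rewards):
    """Optimistic selection: return the candidate with the highest proxy reward."""
    best = max(range(len(candidates)), key=lambda i: rewards[i])
    return candidates[best]


def soft_best_of_n(candidates, rewards, temperature, rng):
    """Regularized selection: sample with probability proportional to
    exp(reward / temperature). A high temperature stays close to the
    reference model's samples; a low temperature approaches Best-of-N."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp((r - top) / temperature) for r in rewards]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical candidates; the last score is a suspicious outlier that an
# imperfect reward model might assign to a reward-hacked response.
candidates = ["a", "b", "c", "d"]
rewards = [0.1, 0.4, 0.35, 9.0]

rng = random.Random(0)
picked_optimistic = best_of_n(candidates, rewards)
picks = [soft_best_of_n(candidates, rewards, temperature=50.0, rng=rng)
         for _ in range(1000)]
print("Best-of-N pick:", picked_optimistic)
print("Soft picks, share of outlier:", picks.count("d") / len(picks))
```

Best-of-$N$ always commits to the outlier, while the regularized variant spreads its choices across candidates. An adaptive method would need some rule for moving between these extremes; the abstract indicates that BoT bases this on the reward distribution's tail.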

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications