Scalable Best-of-N Selection for Large Language Models via Self-Certainty

Zhewei Kang; Xuandong Zhao; Dawn Song

arXiv:2502.18581·cs.CL·December 15, 2025

TL;DR

This paper introduces self-certainty, an efficient metric derived from an LLM's output probability distribution, for best-of-N selection on reasoning tasks. It requires no external reward model and scales with the number of samples N.

Contribution

The paper proposes self-certainty, a reward-free method for evaluating responses that scales with sample size and improves the reasoning performance of large language models.

Findings

01. Self-certainty correlates with response accuracy.

02. It scales effectively with increasing sample size N.

03. It enhances reasoning performance beyond greedy decoding.

Abstract

Best-of-N selection is a key technique for improving the reasoning performance of Large Language Models (LLMs) through increased test-time computation. Current state-of-the-art methods often employ computationally intensive reward models for response evaluation and selection. Reward-free alternatives, like self-consistency and universal self-consistency, are limited in their ability to handle open-ended generation tasks or scale effectively. To address these limitations, we propose self-certainty, a novel and efficient metric that leverages the inherent probability distribution of LLM outputs to estimate response quality without requiring external reward models. We hypothesize that higher distributional self-certainty, aggregated across multiple samples, correlates with improved response accuracy, as it reflects greater confidence in the generated output. Through extensive experiments…
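
The selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact formula, so the score used here (the mean, over token positions, of the KL divergence from the uniform distribution to the model's predicted token distribution) is an assumption, and the names `certainty_score` and `best_of_n` are hypothetical. Each candidate is assumed to come with the per-position logits the model produced while generating it.

```python
import math
from typing import List, Sequence, Tuple


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def certainty_score(token_logits: Sequence[Sequence[float]]) -> float:
    """Assumed confidence score: mean over positions of KL(uniform || p_i).

    A flat (uncertain) distribution scores 0; a peaked one scores higher.
    """
    if not token_logits:
        raise ValueError("response has no tokens")
    total = 0.0
    for logits in token_logits:
        v = len(logits)
        logp = log_softmax(logits)
        # KL(U || p) = sum_j (1/V) * (log(1/V) - log p_j)
        #            = -log V - mean_j log p_j
        total += -math.log(v) - sum(logp) / v
    return total / len(token_logits)


def best_of_n(candidates: Sequence[Tuple[str, Sequence[Sequence[float]]]]) -> str:
    """Return the text of the candidate with the highest score.

    No reward model is consulted; only the model's own logits are used.
    """
    best_text, _ = max(candidates, key=lambda c: certainty_score(c[1]))
    return best_text


if __name__ == "__main__":
    # Toy example with a 3-token vocabulary and two sampled responses.
    samples = [
        ("uncertain answer", [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]]),
        ("confident answer", [[5.0, 0.0, 0.0], [4.0, 0.0, 0.0]]),
    ]
    print(best_of_n(samples))
```

Because the score depends only on logits already computed during sampling, ranking N candidates adds no further model calls, which is consistent with the efficiency claim in the abstract.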

Code & Models

Repositories

backprop07/self-certainty (official)

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques