Beyond Majority Voting: Efficient Best-Of-N with Radial Consensus Score
Manh Nguyen, Sunil Gupta, and Hung Le

TL;DR
The paper introduces Radial Consensus Score (RCS), a geometric, training-free method for selecting the best response from multiple LLM outputs. RCS variants are reported to outperform traditional voting methods across seven benchmarks.
Contribution
RCS models semantic consensus using a weighted Fréchet mean of answer embeddings, supporting multiple weighting schemes and improving answer selection in black-box LLM settings.
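The selection step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes Euclidean distance between embeddings, in which case the weighted Fréchet mean is simply the weighted average, and it assumes the candidate closest to the center is the one selected. The function names, the toy two-dimensional embeddings, and the weight values are all hypothetical. The summary does not say which weighting schemes the paper uses, so weights are passed in as plain numbers.

```python
import math


def weighted_center(embeddings, weights):
    """Weighted average of the embeddings.

    Under squared Euclidean distance (an assumption of this sketch),
    this is the weighted Fréchet mean, i.e. the "semantic center".
    """
    total = sum(weights)
    dim = len(embeddings[0])
    return [
        sum(w * e[k] for w, e in zip(weights, embeddings)) / total
        for k in range(dim)
    ]


def radial_consensus_select(embeddings, weights=None):
    """Return (index of the candidate closest to the center, all radial distances)."""
    if weights is None:
        # Uniform weights: every candidate contributes equally to the center.
        weights = [1.0] * len(embeddings)
    center = weighted_center(embeddings, weights)
    radii = [math.dist(e, center) for e in embeddings]
    best = min(range(len(radii)), key=radii.__getitem__)
    return best, radii


# Toy embeddings (hypothetical): two similar answers and one outlier.
candidates = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

best_uniform, _ = radial_consensus_select(candidates)
# Hypothetical weights that strongly favor the third candidate.
best_weighted, _ = radial_consensus_select(candidates, weights=[1.0, 1.0, 10.0])
```

With uniform weights the center sits near the two similar answers, so one of them is selected. With a large weight on the third candidate the center moves toward it, which shows how a weighting scheme can favor a less frequent response over the majority cluster.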
Findings
RCS variants outperform strong baselines across seven benchmarks.
Performance improves with increased sampling budget.
RCS is effective in multi-agent debate and black-box scenarios.
Abstract
Large language models (LLMs) frequently generate multiple candidate responses for a given prompt, yet selecting the most reliable one remains challenging, especially when correctness diverges from surface-level majority agreement. Existing approaches, such as self-consistency, rely on discrete voting, while probability-based methods often fail to capture relationships among candidate answers or tend to underweight high-quality but less frequent responses, and do not fully leverage the geometric structure of answer representations. To address these limitations, we introduce Radial Consensus Score (RCS), a simple, efficient, and training-free method for best-of-N selection. RCS models semantic consensus by computing a weighted Fréchet mean (semantic center) of answer embeddings and ranking candidates by their radial distance to this center. Importantly, RCS provides a general framework…
