RoBoN: Routed Online Best-of-n for Test-Time Scaling with Multiple LLMs

Jonathan Geuter; Gregor Kornhardt

arXiv:2512.05542 · cs.LG · December 8, 2025

Open Access

TL;DR

RoBoN is a sequential, routing-based method that leverages multiple LLMs at inference time to improve response quality over traditional single-model best-of-$n$ approaches, without additional training.

Contribution

RoBoN introduces a novel online routing mechanism for multiple LLMs, enhancing test-time scaling and response accuracy without extra training or compute overhead.

Findings

01

For larger $n$, RoBoN outperforms standard best-of-$n$ applied to each individual model, with absolute gains of up to 3.4%.

02

The improvement holds across reasoning benchmarks: MATH500, OlympiadBench, MinervaMath, GSM8K, and MMLU.

03

Diversity among models can be exploited at inference to enhance results.

Abstract

Best-of-$n$ is a widely used test-time scaling approach for LLM inference. Yet despite evidence that LLMs exhibit complementary strengths across tasks, traditionally best-of-$n$ relies on a single model to generate responses. We propose RoBoN (Routed Online Best-of-$n$), a sequential multi-LLM alternative to the prevailing single-model best-of-$n$. Given a suite of models $\{m_i\}_{i=1}^{M}$, RoBoN sequentially routes generations one-by-one across models, based on scores computed using a reward model and an agreement signal on the predicted responses. This online routing requires no additional training, keeps compute parity, and works with any plug-in reward model. Across reasoning benchmarks (MATH500, OlympiadBench, MinervaMath, GSM8K, MMLU), RoBoN consistently outperforms standard best-of-$n$ applied to each individual model for larger $n$, with gains of up to 3.4% in absolute…
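The routing loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the reward and agreement signals are combined or how the next model is chosen, so the details below are assumptions for illustration and should not be read as the authors' algorithm: the function name `routed_best_of_n`, the `agreement_weight` parameter, the exact-match agreement measure, the one-sample-per-model warm-up, the greedy routing rule, and the final selection by highest reward are all hypothetical. What the sketch does preserve from the abstract is that generations are drawn one at a time, that routing uses a plug-in reward function plus an agreement signal, and that the total budget stays at $n$ generations.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Generator = Callable[[str], str]         # prompt -> response
RewardFn = Callable[[str, str], float]   # (prompt, response) -> score


def routed_best_of_n(
    prompt: str,
    models: Sequence[Generator],
    reward: RewardFn,
    n: int,
    agreement_weight: float = 0.5,  # hypothetical mixing weight (assumption)
) -> Tuple[str, List[int]]:
    """Draw exactly n responses one at a time, routing each draw to a model."""
    if n < len(models):
        raise ValueError("this sketch needs n >= number of models")

    responses: List[str] = []
    rewards: List[float] = []
    sources: List[int] = []  # index of the model that produced each response

    def draw(i: int) -> None:
        response = models[i](prompt)
        responses.append(response)
        rewards.append(reward(prompt, response))
        sources.append(i)

    def model_score(i: int, counts: Counter) -> float:
        # Mean reward of model i's responses, plus the mean fraction of all
        # responses so far that exactly match them (assumed agreement signal).
        idx = [k for k, s in enumerate(sources) if s == i]
        mean_reward = sum(rewards[k] for k in idx) / len(idx)
        mean_agree = sum(counts[responses[k]] for k in idx) / (
            len(idx) * len(responses)
        )
        return mean_reward + agreement_weight * mean_agree

    # Warm-up (assumption): one sample per model so every model has a score.
    for i in range(len(models)):
        draw(i)

    # Online routing: send each remaining generation to the best-scoring model.
    for _ in range(n - len(models)):
        counts = Counter(responses)
        best_model = max(range(len(models)), key=lambda i: model_score(i, counts))
        draw(best_model)

    # Final selection (assumption): return the response with the highest reward.
    best = max(range(n), key=rewards.__getitem__)
    return responses[best], sources


if __name__ == "__main__":
    # Toy stand-ins for LLMs and a reward model, purely for demonstration.
    toy_models = [lambda p: "41", lambda p: "42"]
    toy_reward = lambda p, r: 1.0 if r == "42" else 0.0
    answer, routed_to = routed_best_of_n("6 * 7 = ?", toy_models, toy_reward, n=6)
    print(answer, routed_to)
```

A greedy rule like this one never revisits a model whose first sample scored poorly; a practical router would likely need some form of exploration, which this sketch leaves out for brevity.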

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Explainable Artificial Intelligence (XAI)