R2-Router: A New Paradigm for LLM Routing with Reasoning
Jiaqi Xue, Qian Lou, Jiarong Xing, Heng Huang

TL;DR
R2-Router introduces a novel LLM routing approach that jointly optimizes LLM choice and output length budget, enabling cost-effective high-quality responses and surpassing existing methods in efficiency.
Contribution
The paper presents R2-Router, a routing paradigm that treats output length as a controllable variable, and introduces R2-Bench, a dataset for evaluating this approach.
Findings
R2-Router achieves 4-5x lower cost than existing routers.
Joint selection of LLM and length budget improves performance.
R2-Bench effectively captures LLM behavior across budgets.
Abstract
As LLMs proliferate with diverse capabilities and costs, LLM routing has emerged by learning to predict each LLM's quality and cost for a given query, then selecting the one with high quality and low cost. However, existing routers implicitly assume a single fixed quality and cost per LLM for each query, ignoring that the same LLM's quality varies with its output length. This causes routers to exclude powerful LLMs when their estimated cost exceeds the budget, missing the opportunity that these LLMs could still deliver high quality at reduced cost with shorter outputs. To address this, we introduce R2-Router, which treats output length budget as a controllable variable and jointly selects the best LLM and length budget, enforcing the budget via length-constrained instructions. This enables R2-Router to discover that a powerful LLM with constrained output can outperform a weaker LLM at…
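The joint selection described in the abstract can be illustrated with a minimal sketch. It assumes a router has already predicted a quality and a cost for every (LLM, length budget) pair; the model names, numbers, and the `route` function are hypothetical, and the selection rule (highest predicted quality among pairs within a cost cap) is a simplification, not the authors' actual objective.

```python
from typing import Dict, Optional, Tuple

# A candidate is a hypothetical (model name, output length budget in tokens) pair.
Candidate = Tuple[str, int]


def route(
    predictions: Dict[Candidate, Tuple[float, float]],
    max_cost: float,
) -> Optional[Candidate]:
    """Pick the (model, budget) pair with the highest predicted quality
    whose predicted cost stays within max_cost; ties go to the cheaper pair.
    Returns None when no pair is affordable."""
    feasible = {c: qc for c, qc in predictions.items() if qc[1] <= max_cost}
    if not feasible:
        return None
    return max(feasible, key=lambda c: (feasible[c][0], -feasible[c][1]))


# Illustrative (made-up) predictions: candidate -> (quality, cost).
predictions = {
    ("strong-llm", 1024): (0.90, 10.0),
    ("strong-llm", 128): (0.80, 2.0),
    ("weak-llm", 1024): (0.60, 2.0),
    ("weak-llm", 128): (0.50, 0.5),
}

choice = route(predictions, max_cost=2.0)
print(choice)
```

With these made-up numbers, a router that only knew the full-length cost of `strong-llm` would exclude it under a cost cap of 2.0, while the joint search still finds the length-constrained `strong-llm` option and prefers it over `weak-llm` at the same cost.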
Taxonomy
Topics: Network Packet Processing and Optimization · Software-Defined Networks and 5G · Advanced Neural Network Applications
