Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning

Haozhen Zhang; Tao Feng; Jiaxuan You

arXiv:2506.09033 · cs.CL · October 27, 2025

TL;DR

Router-R1 introduces a reinforcement learning framework that enables large language models to perform multi-round routing and aggregation, improving task performance by dynamically selecting and combining multiple models.

Contribution

The paper presents Router-R1, a novel RL-based approach allowing LLMs to perform multi-round routing and aggregation, enhancing multi-model collaboration for complex tasks.

Findings

01. Router-R1 outperforms strong baselines on seven QA benchmarks.

02. It achieves better performance while managing costs effectively.

03. The method generalizes well to unseen models.

Abstract

The rapid emergence of diverse large language models (LLMs) has spurred the development of LLM routers that assign user queries to the most suitable model. However, existing LLM routers typically perform a single-round, one-to-one mapping (i.e., assigning each query to a single model in isolation), which limits their capability to tackle complex tasks that demand the complementary strengths of multiple LLMs. In this paper, we present Router-R1, a reinforcement learning (RL)-based framework that formulates multi-LLM routing and aggregation as a sequential decision process. Router-R1 instantiates the router itself as a capable LLM, leveraging its reasoning ability to interleave "think" actions (internal deliberation) with "route" actions (dynamic model invocation), and integrates each response into its evolving context. To facilitate learning, we employ a lightweight…
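
The sequential decision process described in the abstract can be illustrated with a toy control loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `run_episode`, `toy_policy`, `toy_models`, and the `"answer"` action that ends an episode are hypothetical, the policy is a hand-written script standing in for the LLM router, the candidate models are stub functions, and no reinforcement learning is performed. It only shows how "think" and "route" actions might interleave while each response is appended to an evolving context.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical action format: ("think", text), ("route", (model_name, sub_query)),
# or ("answer", text). The "answer" action is an assumption made for this sketch.
Action = Tuple[str, object]
Policy = Callable[[List[str]], Action]


def run_episode(
    query: str,
    policy: Policy,
    models: Dict[str, Callable[[str], str]],
    max_steps: int = 6,
) -> Tuple[Optional[str], List[str]]:
    """Interleave think/route actions until the policy answers or steps run out."""
    context: List[str] = [f"query: {query}"]  # evolving context the router conditions on
    for _ in range(max_steps):
        kind, payload = policy(context)
        if kind == "think":
            # Internal deliberation: recorded in context, no model is called.
            context.append(f"think: {payload}")
        elif kind == "route":
            # Dynamic model invocation: call the chosen model, fold its reply into context.
            name, sub_query = payload
            response = models[name](sub_query)
            context.append(f"route[{name}]: {sub_query} -> {response}")
        elif kind == "answer":
            return str(payload), context
        else:
            raise ValueError(f"unknown action kind: {kind!r}")
    return None, context  # step budget exhausted without an answer


def toy_policy(context: List[str]) -> Action:
    """Scripted stand-in for the LLM router: think once, route twice, then aggregate."""
    routed = [c for c in context if c.startswith("route[")]
    if not any(c.startswith("think:") for c in context):
        return ("think", "split the question into two sub-queries")
    if len(routed) == 0:
        return ("route", ("model_a", "part one"))
    if len(routed) == 1:
        return ("route", ("model_b", "part two"))
    # Aggregate the routed responses into a final answer.
    return ("answer", " | ".join(c.split(" -> ", 1)[1] for c in routed))


# Stub candidate models (hypothetical); real ones would be LLM calls.
toy_models = {
    "model_a": lambda q: f"A({q})",
    "model_b": lambda q: f"B({q})",
}

answer, trace = run_episode("toy question", toy_policy, toy_models)
print(answer)
for line in trace:
    print(line)
```

In this sketch the router's only state is the context list, so a learned policy could replace `toy_policy` without changing the loop; how the actual framework encodes actions and rewards is not specified in the text above.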

Code & Models

Repositories

ulab-uiuc/router-r1
PyTorch · Official

Videos

Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning · SlidesLive