R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Tianyu Fu; Yi Ge; Yichen You; Enshu Liu; Zhihang Yuan; Guohao Dai; Shengen Yan; Huazhong Yang; Yu Wang

arXiv:2505.21600 · cs.CL · November 6, 2025

TL;DR

R2R introduces a token routing method that selectively employs large models only for divergent reasoning tokens, significantly improving efficiency while maintaining high accuracy in complex tasks.

Contribution

The paper presents R2R, a neural token routing approach that efficiently combines small and large language models by identifying and focusing on divergent reasoning tokens.

Findings

01

R2R reaches 1.6x the average accuracy of R1-7B while activating fewer parameters on average.

02

R2R achieves a 2.8x wall-clock speedup over R1-32B with comparable performance.

03

Most tokens are neutral or identical, enabling efficient routing.
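
To make the third finding concrete, the sketch below compares the next tokens an SLM and an LLM would emit at the same positions and labels each position as identical or different. The token lists are invented for illustration and the function name `label_positions` is hypothetical; nothing here comes from the paper's code. Splitting the "different" positions into neutral and divergent needs a further judgment about whether the difference changes the reasoning path, which this sketch does not attempt.

```python
from collections import Counter
from typing import List


def label_positions(slm_tokens: List[str], llm_tokens: List[str]) -> List[str]:
    """Label each aligned position as 'identical' or 'different'.

    Assumes both lists hold next-token predictions made from the same
    prefixes, so position i in one list corresponds to position i in the other.
    """
    if len(slm_tokens) != len(llm_tokens):
        raise ValueError("token lists must be aligned and of equal length")
    return [
        "identical" if s == l else "different"
        for s, l in zip(slm_tokens, llm_tokens)
    ]


# Invented example: one wording difference ("'s" vs "us") and one
# difference in the final answer ("5" vs "4").
slm = ["Let", "'s", "compute", "the", "sum", ":", "5"]
llm = ["Let", "us", "compute", "the", "sum", ":", "4"]

labels = label_positions(slm, llm)
counts = Counter(labels)
identical_share = counts["identical"] / len(labels)
```

In this toy case only the two "different" positions would need any further inspection, and of those only the final answer plausibly changes the outcome.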

Abstract

Large Language Models (LLMs) achieve impressive reasoning capabilities at the cost of substantial inference overhead, posing substantial deployment challenges. Although distilled Small Language Models (SLMs) significantly enhance efficiency, their performance suffers as they fail to follow LLMs' reasoning paths. Luckily, we reveal that only a small fraction of tokens genuinely diverge reasoning paths between LLMs and SLMs. Most generated tokens are either identical or exhibit neutral differences, such as minor variations in abbreviations or expressions. Leveraging this insight, we introduce Roads to Rome (R2R), a neural token routing method that selectively utilizes LLMs only for these critical, path-divergent tokens, while leaving the majority of token generation to the SLM. We also develop an automatic data generation pipeline that identifies divergent tokens and generates…
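
The routing idea in the abstract can be sketched as a decoding loop: the SLM proposes each token, a router scores whether that token is likely to be path-divergent, and the LLM is consulted only when the score crosses a threshold. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `routed_generate`, `router_score`, and `threshold`, the router's inputs, and the toy models are all hypothetical stand-ins.

```python
from typing import Callable, List, Tuple

Token = str
NextToken = Callable[[List[Token]], Token]       # context -> next token
Router = Callable[[List[Token], Token], float]   # (context, SLM token) -> score


def routed_generate(
    prompt: List[Token],
    slm_next: NextToken,
    llm_next: NextToken,
    router_score: Router,
    threshold: float = 0.5,   # assumed decision threshold
    max_new_tokens: int = 32,
    eos: Token = "<eos>",
) -> Tuple[List[Token], int]:
    """Generate tokens with the SLM, deferring to the LLM when the router fires."""
    tokens = list(prompt)
    llm_calls = 0
    for _ in range(max_new_tokens):
        candidate = slm_next(tokens)                 # cheap proposal
        if router_score(tokens, candidate) >= threshold:
            candidate = llm_next(tokens)             # expensive replacement
            llm_calls += 1
        tokens.append(candidate)
        if candidate == eos:
            break
    return tokens[len(prompt):], llm_calls


# Toy stand-ins: the two "models" replay fixed outputs that differ at one step.
LLM_OUT = ["2", "+", "2", "=", "4", "<eos>"]
SLM_OUT = ["2", "+", "2", "=", "5", "<eos>"]


def toy_llm(ctx: List[Token]) -> Token:
    return LLM_OUT[len(ctx) - 1]   # the prompt below has length 1


def toy_slm(ctx: List[Token]) -> Token:
    return SLM_OUT[len(ctx) - 1]


def toy_router(ctx: List[Token], proposed: Token) -> float:
    # Hypothetical rule: treat the token right after "=" as divergent.
    return 1.0 if ctx[-1] == "=" else 0.0


generated, llm_calls = routed_generate(["Q"], toy_slm, toy_llm, toy_router)
```

Here the LLM is called once out of six decoding steps, and the single token it supplies corrects the answer. A trained router would replace the hand-written rule in `toy_router`.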

Code & Models

Repositories

thu-nics/r2r (PyTorch, official)

Models

nics-efc/R2R_router (Hugging Face model)

Datasets

Videos

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Materials Science

Methods: Rank-One Model Editing