SynapseRoute: An Auto-Route Switching Framework on Dual-State Large Language Model

Wencheng Zhang; Shiqin Qiao; Lingjie Luo; Yinfeng Li; Chuanyang Zheng; Qian Xu; Meng Li; Yong Gui; Yijun He; Jianing Qiu; Jindong Hong; Jiankai Sun

arXiv:2507.02822·cs.CL·July 4, 2025

TL;DR

This paper introduces SynapseRoute, a dynamic routing framework for dual-mode large language models that assigns each query to either the high-cost thinking mode or the low-cost non-thinking mode, aiming to improve both accuracy and efficiency.

Contribution

It proposes a machine learning-based routing method that classifies each query by complexity and selects the thinking or non-thinking mode accordingly, balancing accuracy against operational cost.
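To make the routing idea concrete, here is a minimal Python sketch of a complexity-based router. It is an illustration only, not the classifier used in the paper: the cue words, the scoring formula, the 0.2 threshold, and the names `complexity_score`, `route`, and `RouteDecision` are all hypothetical stand-ins for whatever learned model SynapseRoute actually uses.

```python
from dataclasses import dataclass


@dataclass
class RouteDecision:
    mode: str     # "thinking" or "non_thinking"
    score: float  # toy complexity estimate in [0, 1]


# Hypothetical cue phrases, chosen for illustration; not taken from the paper.
COMPLEX_CUES = ("differential", "mechanism", "why", "compare", "most likely")


def complexity_score(query: str) -> float:
    """Toy complexity estimate that mixes query length with cue-phrase hits."""
    words = query.lower().split()
    text = " ".join(words)
    length_part = min(len(words) / 50.0, 1.0)
    cue_part = sum(cue in text for cue in COMPLEX_CUES) / len(COMPLEX_CUES)
    return 0.5 * length_part + 0.5 * cue_part


def route(query: str, threshold: float = 0.2) -> RouteDecision:
    """Send queries scoring at or above the threshold to the thinking mode."""
    score = complexity_score(query)
    mode = "thinking" if score >= threshold else "non_thinking"
    return RouteDecision(mode=mode, score=score)


if __name__ == "__main__":
    for q in (
        "What is the normal adult heart rate?",
        "Why is this the most likely mechanism? Compare the differential diagnoses.",
    ):
        print(route(q).mode, "<-", q)
```

In a real system the hand-written score would presumably be replaced by a trained classifier, and the threshold would be tuned on held-out data to trade accuracy against cost.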

Findings

01

The non-thinking mode alone can accurately answer approximately 58% of medical questions.

02

SynapseRoute improves accuracy from 0.8272 to 0.8390.

03

It reduces inference time by 36.8% and token consumption by 39.66%.
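The size of these changes is easier to judge with a little arithmetic. The short Python sketch below uses only the figures quoted in the findings; the variable names are illustrative.

```python
# Figures quoted in the findings above.
acc_before = 0.8272
acc_after = 0.8390
time_reduction = 0.368     # 36.8% less inference time
token_reduction = 0.3966   # 39.66% fewer tokens

abs_gain = acc_after - acc_before   # about 0.0118, i.e. 1.18 accuracy points
rel_gain = abs_gain / acc_before    # about 1.4% relative improvement

time_remaining = 1.0 - time_reduction     # fraction of inference time still spent
token_remaining = 1.0 - token_reduction   # fraction of tokens still consumed

print(f"absolute accuracy gain: {abs_gain:.4f}")
print(f"relative accuracy gain: {rel_gain:.2%}")
print(f"inference time remaining: {time_remaining:.1%}")
print(f"token consumption remaining: {token_remaining:.2%}")
```

The accuracy gain is small in absolute terms, so the main practical effect reported here is the cost saving, achieved without losing accuracy.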

Abstract

With the widespread adoption of large language models (LLMs) in practical applications, selecting an appropriate model requires balancing not only performance but also operational cost. The emergence of reasoning-capable models has further widened the cost gap between "thinking" (high reasoning) and "non-thinking" (fast, low-cost) modes. In this work, we reveal that approximately 58% of medical questions can be accurately answered by the non-thinking mode alone, without requiring the high-cost reasoning process. This highlights a clear dichotomy in problem complexity and suggests that dynamically routing queries to the appropriate mode based on complexity could optimize accuracy, cost-efficiency, and overall user experience. Based on this, we further propose SynapseRoute, a machine learning-based dynamic routing framework that intelligently assigns input queries to either thinking or…
