Models Under SCOPE: Scalable and Controllable Routing via Pre-hoc Reasoning
Qi Cao, Shuhao Zhang, Ruizhe Zhou, Ruiyi Zhang, Peijia Qin, Pengtao Xie

TL;DR
SCOPE is a reinforcement learning-based routing framework that predicts model performance and cost, enabling dynamic, adaptable routing decisions to optimize accuracy and efficiency in language model usage.
Contribution
It introduces a novel reasoning-based prediction approach that generalizes to unseen models and allows explicit control over accuracy-cost trade-offs.
Findings
Achieves up to 25.7% accuracy improvement.
Reduces inference costs by up to 95.1%.
Adapts to new models and changing user preferences.
Abstract
Model routing chooses which language model to use for each query. By sending easy queries to cheaper models and hard queries to stronger ones, it can significantly reduce inference cost while maintaining high accuracy. However, most existing routers treat this as a fixed choice among a small set of models, which makes them hard to adapt to new models or changing budget constraints. In this paper, we propose SCOPE (Scalable and Controllable Outcome Performance Estimator), a routing framework that goes beyond model selection by predicting each candidate model's cost and performance. Trained with reinforcement learning, SCOPE makes reasoning-based predictions by retrieving how models behave on similar problems, rather than relying on fixed model names, which enables it to work with new, unseen models. Moreover, by explicitly predicting how accurate and how expensive a model will be, it turns routing into a…
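To make the idea of routing from predicted accuracy and cost concrete, here is a minimal sketch. It is not the paper's method: the `Prediction` type, the `route` function, the `cost_weight` parameter, the linear utility, and the example numbers are all assumptions chosen for illustration. It only shows how explicit per-model predictions could let a user trade accuracy against cost with a single knob.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prediction:
    """Hypothetical per-query prediction for one candidate model."""
    model: str
    accuracy: float  # predicted probability of a correct answer, in [0, 1]
    cost: float      # predicted cost of answering the query (arbitrary units)


def route(predictions: List[Prediction], cost_weight: float) -> Prediction:
    """Pick the candidate with the highest accuracy-minus-cost utility.

    cost_weight is an assumed user preference: 0.0 ignores cost entirely,
    larger values favor cheaper models. Costs are scaled by the most
    expensive candidate so the weight is unit-free.
    """
    if not predictions:
        raise ValueError("need at least one candidate prediction")
    max_cost = max(p.cost for p in predictions)

    def utility(p: Prediction) -> float:
        scaled_cost = p.cost / max_cost if max_cost > 0 else 0.0
        return p.accuracy - cost_weight * scaled_cost

    return max(predictions, key=utility)


# Illustrative, made-up predictions for a single query.
candidates = [
    Prediction("small-model", accuracy=0.62, cost=0.0004),
    Prediction("large-model", accuracy=0.91, cost=0.0120),
]

accuracy_first = route(candidates, cost_weight=0.0)
budget_first = route(candidates, cost_weight=1.0)
```

With these made-up numbers, `accuracy_first` is the large model and `budget_first` is the small one. Because the selection rule only consumes predictions, a new model could in principle be added by supplying its predictions, without changing the rule itself.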
Taxonomy
Topics: Software-Defined Networks and 5G · Network Packet Processing and Optimization · Network Traffic and Congestion Control
