Federate the Router: Learning Language Model Routers with Sparse and Decentralized Evaluations
Baris Askin, Shivam Patel, Anupam Nayak, Andrea Vigano, Jiin Woo, Gauri Joshi, Carlee Joe-Wong

TL;DR
This paper introduces a federated learning framework for language model routing that enables multiple clients to collaboratively learn effective routing policies without sharing sensitive data, improving model coverage and cost-efficiency.
Contribution
It presents the first federated approach for LLM routing, supporting both parametric and nonparametric routers under diverse client data distributions.
Findings
Federated training improves routing accuracy and cost-efficiency over routers trained on a single client's limited local evaluation data.
The framework enhances model coverage and query generalization.
Theoretical analysis confirms reduced routing suboptimality.
Abstract
Large language models (LLMs) are increasingly accessed as remotely hosted services by edge and enterprise clients that cannot run frontier models locally. Since models vary widely in capability and price, routing queries to models that balance quality and inference cost is essential. Existing router approaches assume access to centralized query-model evaluation data. However, these data are often fragmented across clients, such as end users and organizations, and are privacy-sensitive, which makes centralizing data infeasible. Additionally, per-client router training is ineffective since local evaluation data is limited and covers only a restricted query distribution and a biased subset of model evaluations. We introduce the first federated framework for LLM routing, enabling clients to learn a shared routing policy from local offline query-model evaluation data. Our framework supports…
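The setting described in the abstract can be illustrated with a toy sketch. This listing does not describe the paper's algorithm, so the code below is an assumption: it uses plain federated averaging over a linear (parametric) quality predictor as a stand-in. Every name (`local_update`, `fed_avg`, `route`), the two-model setup, the costs, and the trade-off weight `lam` are hypothetical. Each client has evaluated only one of the two models, mirroring the "biased subset of model evaluations" mentioned above.

```python
# Toy sketch: federated averaging of a linear router.
# Illustrative assumption only, not the paper's algorithm; all data is made up.

def predict(weights, x):
    """Predicted quality of each model for query features x (one row per model)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in weights]

def local_update(weights, data, lr=0.1, epochs=5):
    """Run SGD on one client's sparse evaluations.

    data holds (features, model_index, observed_quality) triples. A query is
    scored on a single model, so only that model's row is updated.
    """
    w = [row[:] for row in weights]
    for _ in range(epochs):
        for x, m, q in data:
            err = sum(wi * xi for wi, xi in zip(w[m], x)) - q
            w[m] = [wi - lr * err * xi for wi, xi in zip(w[m], x)]
    return w

def fed_avg(client_weights, client_sizes):
    """Server step: average client parameters, weighted by local sample count."""
    total = sum(client_sizes)
    n_models = len(client_weights[0])
    dim = len(client_weights[0][0])
    return [
        [sum(cw[m][d] * n for cw, n in zip(client_weights, client_sizes)) / total
         for d in range(dim)]
        for m in range(n_models)
    ]

def route(weights, x, costs, lam):
    """Pick the model that maximizes predicted quality minus lam * cost."""
    scores = predict(weights, x)
    return max(range(len(costs)), key=lambda m: scores[m] - lam * costs[m])

# Features are [bias, hardness]. Hypothetical ground truth: the cheap model 0
# scores 1 - hardness, the expensive model 1 scores 0.9 on every query.
hardness = [0.0, 0.25, 0.5, 0.75, 1.0]
client_a = [([1.0, h], 0, 1.0 - h) for h in hardness]  # evaluated model 0 only
client_b = [([1.0, h], 1, 0.9) for h in hardness]      # evaluated model 1 only
clients = [client_a, client_b]
costs = [0.1, 1.0]
lam = 0.3

# Federated training: raw evaluations never leave a client, only weights do.
global_w = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(100):
    updates = [local_update(global_w, data) for data in clients]
    global_w = fed_avg(updates, [len(data) for data in clients])

# Baseline: client A trains alone and learns nothing about model 1.
local_a = local_update([[0.0, 0.0], [0.0, 0.0]], client_a, epochs=500)

easy, hard = [1.0, 0.1], [1.0, 1.0]
print("federated:", route(global_w, easy, costs, lam), route(global_w, hard, costs, lam))
print("client A only:", route(local_a, easy, costs, lam), route(local_a, hard, costs, lam))
```

In this toy setup the federated router keeps the easy query on the cheap model and sends the hard query to the expensive one. Client A's local router never selects the model it has no evaluations for. A real system would likely need per-model weighting in the averaging step and protections beyond keeping raw data local; both are omitted here.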
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Graph Neural Networks · Big Data and Digital Economy
