Latency-Quality Routing for Functionally Equivalent Tools in LLM Agents

Kexin Chu; Dawei Xiang; Wei Zhang

arXiv:2605.14241·cs.LG·May 15, 2026

TL;DR

This paper introduces LQM-ContextRoute, a contextual bandit router for selecting among functionally equivalent tool providers in LLM agents, optimizing latency and quality under load.

Contribution

It proposes a novel latency-quality matching approach that ranks providers by expected answer quality per service cycle, adapting online to load and quality differences.

Findings

01. Improves F1 by +2.18 pp on the web-search benchmark.

02. Improves accuracy by up to +18 pp in the StrategyQA setting.

03. Increases NDCG by +2.91 to +3.22 pp on heterogeneous retriever pools.

Abstract

Tool-augmented LLM agents increasingly access the same tool type through multiple functionally equivalent providers, such as web-search APIs, retrievers, or LLM backends exposed behind a shared interface. This creates a provider-routing problem under runtime load: the router must choose among providers that differ in latency, reliability, and answer quality, often without gold labels at deployment time. We introduce LQM-ContextRoute, a contextual bandit router for same-function tool providers. Its key design is latency-quality matching: instead of letting low latency offset poor answers in an additive reward, the router ranks providers by expected answer quality per service cycle. It combines this capacity-aware score with query-specific quality estimation and LLM-as-judge feedback, allowing it to adapt online to both load changes and provider-quality differences. On the main web-search…
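To make the contrast between an additive reward and latency-quality matching concrete, here is a minimal sketch. It is not the paper's implementation: the abstract does not give the exact formula, so the ratio `expected_quality / expected_cycle_s` is an assumed reading of "expected answer quality per service cycle", and the names `ProviderEstimate`, `additive_score`, `quality_per_cycle`, `latency_weight`, and the numeric estimates are all hypothetical. The sketch also omits the contextual bandit, the query-specific quality estimator, and the LLM-as-judge feedback; it only shows how the two scoring rules can rank the same pair of providers differently.

```python
from dataclasses import dataclass


@dataclass
class ProviderEstimate:
    """Hypothetical running estimates for one provider (illustrative only)."""
    name: str
    expected_quality: float  # assumed to lie in [0, 1], e.g. a judge score
    expected_cycle_s: float  # assumed seconds per service cycle under load


def additive_score(p: ProviderEstimate, latency_weight: float = 0.5) -> float:
    # Additive reward: low latency can offset poor quality.
    return p.expected_quality - latency_weight * p.expected_cycle_s


def quality_per_cycle(p: ProviderEstimate) -> float:
    # Assumed form of latency-quality matching: quality delivered per unit
    # of service time. A near-zero-quality provider scores near zero no
    # matter how fast it is.
    return p.expected_quality / max(p.expected_cycle_s, 1e-9)


def pick(providers, score) -> str:
    # Return the name of the provider with the highest score.
    return max(providers, key=score).name


providers = [
    ProviderEstimate("fast_poor", expected_quality=0.05, expected_cycle_s=0.2),
    ProviderEstimate("slow_good", expected_quality=0.80, expected_cycle_s=2.0),
]

additive_choice = pick(providers, additive_score)
matched_choice = pick(providers, quality_per_cycle)
print("additive reward picks:", additive_choice)
print("quality per cycle picks:", matched_choice)
```

With these made-up numbers, the additive rule prefers the fast provider because its latency saving outweighs its low quality, while the ratio rule prefers the slower provider that returns more quality per second of service. A different `latency_weight` or different estimates can change either outcome, so this shows the qualitative effect only.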
