Uno-Orchestra: Parsimonious Agent Routing via Selective Delegation
Zhiqing Cui, Haotong Xie, Jiahao Yuan, Cheng Yang, Hanqing Wang, Yuxin Wu, Yifan Wu, Siru Zhong, Tao Yu, Yifu Guo, Siyu Zhang, Xinlei Yu, Qibing Ren, Usman Naseem

TL;DR
Uno-Orchestra is a unified, RL-trained orchestration policy that jointly learns task decomposition and worker assignment for multi-agent LLM systems, reaching 77.0% macro pass@1 at roughly an order of magnitude lower per-query cost.
Contribution
It introduces a learned orchestration policy that jointly optimizes selective task decomposition and the dispatch of each subtask to an admissible (model, primitive) pair, evaluated against 22 baselines on a 13-benchmark suite.
Findings
Achieves 77.0% macro pass@1, roughly 16% above the strongest workflow baseline.
Reduces per-query cost by roughly an order of magnitude.
Demonstrates effectiveness across math, code, knowledge, long-context, and agentic tool-use tasks.
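For readers unfamiliar with the headline metric, the following is a minimal sketch of how a macro-averaged pass@1 is conventionally computed: pass@1 is taken per benchmark, then averaged with equal weight per benchmark. The benchmark names and outcomes are hypothetical toy values, not results from the paper, and the paper's exact evaluation protocol may differ.

```python
from statistics import mean


def pass_at_1(outcomes):
    """Fraction of queries solved on the first attempt."""
    return sum(outcomes) / len(outcomes)


def macro_pass_at_1(per_benchmark):
    """Unweighted mean of per-benchmark pass@1 (each benchmark counts once)."""
    return mean(pass_at_1(outs) for outs in per_benchmark.values())


# Hypothetical toy outcomes (True = first attempt passed); not from the paper.
results = {
    "math_toy": [True, True, False, True],  # pass@1 = 0.75
    "code_toy": [True, False],              # pass@1 = 0.50
}

macro = macro_pass_at_1(results)
# Micro average pools all queries, so larger benchmarks weigh more.
micro = pass_at_1([o for outs in results.values() for o in outs])
print(f"macro={macro:.3f} micro={micro:.3f}")
```

The macro average keeps a large benchmark from dominating the score, which matters when a suite mixes benchmarks of very different sizes.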
Abstract
Large language model (LLM) multi-agent systems typically rely on rigid orchestration, committing either to flat per-query routing or to hand-engineered task decomposition, so decomposition depth, worker choice, and inference budget are not jointly optimized under one objective. We introduce Uno-Orchestra, a unified orchestration policy that selectively decomposes a task and dispatches each subtask to an admissible (model, primitive) pair, with both decisions learned together from curated RL trajectories grounded in real worker interactions. Against 22 baselines on a 13-benchmark suite spanning math, code, knowledge, long-context, and agentic tool-use, Uno-Orchestra reaches 77.0% macro pass@1, roughly 16% above the strongest workflow baseline, at roughly an order of magnitude lower per-query cost, advancing the accuracy-efficiency frontier of selective delegation.
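The control flow described in the abstract can be illustrated with a toy sketch: decide whether to decompose a task, then send each subtask to an admissible (model, primitive) pair. Everything below is an assumption made for illustration, including the worker names, costs, skill values, admissibility table, and the hand-written cost-aware score. In the paper both decisions are learned from RL trajectories, so the heuristic here only stands in for that learned policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    model: str      # hypothetical model name
    primitive: str  # hypothetical primitive, e.g. "code_exec"
    cost: float     # assumed per-call cost, arbitrary units
    skill: float    # assumed success probability in [0, 1]


# Hypothetical worker pool; not taken from the paper.
WORKERS = [
    Worker("small-model", "direct_answer", cost=1.0, skill=0.6),
    Worker("small-model", "code_exec", cost=2.0, skill=0.8),
    Worker("large-model", "direct_answer", cost=10.0, skill=0.9),
]

# Assumed admissibility table: primitives allowed for each subtask type.
ADMISSIBLE = {
    "arithmetic": {"direct_answer", "code_exec"},
    "lookup": {"direct_answer"},
}


def select_worker(subtask_type, cost_weight=0.02):
    """Pick the admissible worker with the best skill-minus-cost score."""
    candidates = [w for w in WORKERS if w.primitive in ADMISSIBLE[subtask_type]]
    return max(candidates, key=lambda w: w.skill - cost_weight * w.cost)


def orchestrate(task):
    """Decompose only when the task lists at least two subtasks."""
    subtasks = task.get("subtasks") or []
    plan = subtasks if len(subtasks) >= 2 else [task["type"]]
    return [(t, select_worker(t)) for t in plan]


single = orchestrate({"type": "arithmetic"})
multi = orchestrate({"type": "lookup", "subtasks": ["lookup", "arithmetic"]})
for subtask, worker in single + multi:
    print(subtask, "->", worker.model, worker.primitive)
```

With these toy numbers the cheap code-executing worker wins arithmetic subtasks, and the larger model is used only where no cheaper admissible option scores higher. This is the kind of accuracy-versus-cost trade-off that a learned policy would tune jointly rather than through a fixed weight.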
