Small Model as Master Orchestrator: Learning Unified Agent-Tool Orchestration with Parallel Subtask Decomposition

Wenzhen Yuan; Wutao Xiong; Fanchen Yu; Shengji Tang; Ting Liu; Tao Chen; Peng Ye; Yuzhuo Fu; Wanli Ouyang; Lei Bai

arXiv:2604.17009·cs.AI·April 21, 2026

TL;DR

This paper introduces a unified parallel orchestration framework for multi-agent systems, enabling flexible, learnable coordination of agents and tools through a lightweight, robust orchestrator trained with supervised fine-tuning and reinforcement learning.

Contribution

The paper proposes Agent-as-Tool, a standardized paradigm for agent-tool orchestration, and develops ParaManager, a lightweight, state-aware orchestrator trained with a novel two-stage pipeline.

Findings

1. ParaManager achieves strong performance across multiple benchmarks.
2. It exhibits robust generalization under unseen model pools.
3. The approach improves system extensibility and coordination flexibility.

Abstract

Multi-agent systems (MAS) demonstrate clear advantages in tackling complex problems by coordinating diverse agents and external tools. However, most existing orchestration methods rely on static workflows or serial agent scheduling, and are further constrained by heterogeneous interface protocols between tools and agents. This leads to high system complexity and poor extensibility. To mitigate these issues, we propose Agent-as-Tool, a unified parallel orchestration paradigm that abstracts both agents and tools into a standardized, learnable action space with protocol normalization and explicit state feedback. Building on this paradigm, we train a lightweight orchestrator, ParaManager, which decouples planning decisions from subtask solving, enabling state-aware parallel subtask decomposition, delegation, and asynchronous execution. For training, we adopt a two-stage ParaManager training…
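The abstract's core idea (agents and tools behind one call interface, dispatched in parallel, each returning explicit state feedback) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `ToolResult`, `REGISTRY`, `call`, and `dispatch_parallel`, the two toy handlers, and the `"ok"`/`"error"` status strings are all hypothetical, and the plan is hard-coded here where ParaManager would produce it.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List, Tuple


@dataclass
class ToolResult:
    """Normalized reply shape shared by every action (hypothetical)."""
    name: str
    status: str  # "ok" or "error": the explicit state feedback
    output: str


Handler = Callable[[str], Awaitable[str]]


async def calculator(subtask: str) -> str:
    # A plain tool: sums integers written as "a+b+c".
    return str(sum(int(x) for x in subtask.split("+")))


async def summarizer_agent(subtask: str) -> str:
    # Stand-in for an LLM agent; a real one would call a model here.
    await asyncio.sleep(0)
    return f"summary of: {subtask}"


# Agents and tools live in one registry behind the same signature.
REGISTRY: Dict[str, Handler] = {
    "calculator": calculator,
    "summarizer": summarizer_agent,
}


async def call(name: str, subtask: str) -> ToolResult:
    """Run one action and always return a ToolResult, never raise."""
    handler = REGISTRY.get(name)
    if handler is None:
        return ToolResult(name, "error", f"unknown action: {name}")
    try:
        return ToolResult(name, "ok", await handler(subtask))
    except Exception as exc:  # failures become feedback for the planner
        return ToolResult(name, "error", repr(exc))


async def dispatch_parallel(plan: List[Tuple[str, str]]) -> List[ToolResult]:
    """Execute all (action, subtask) pairs concurrently, preserving plan order."""
    return list(await asyncio.gather(*(call(n, s) for n, s in plan)))


def run_demo() -> List[ToolResult]:
    # Hard-coded plan; in the paper's setting the orchestrator would emit it.
    plan = [
        ("calculator", "2+3"),
        ("summarizer", "MAS orchestration"),
        ("calculator", "two+3"),  # malformed input, reported as an error
        ("search", "x"),          # unregistered action, reported as an error
    ]
    return asyncio.run(dispatch_parallel(plan))
```

Because every call returns a status rather than raising, an orchestrator in this sketch could inspect the results and decide whether to retry, re-delegate, or proceed. How ParaManager actually uses its state feedback is not specified in the excerpt above.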
