MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning

Yaolun Zhang; Yujie Zhao; Nan Wang; Yiran Wu; Jiayu Chang; Yizhao Chen; Qingyun Wu; Jishen Zhao; Huazheng Wang

arXiv:2605.14212·cs.AI·May 15, 2026

TL;DR

MetaAgent-X introduces an end-to-end reinforcement learning framework that jointly optimizes the design and execution of multi-agent systems, surpassing existing methods and enabling self-designing, self-executing agents.

Contribution

The paper presents an end-to-end RL approach to automatic MAS that jointly optimizes design and execution, together with new techniques for stable training and designer-executor co-evolution.

Findings

01 MetaAgent-X achieves up to 21.7% performance gains over baselines.

02 Both designer and executor improve during training.

03 Stagewise co-evolution is effective for automatic MAS learning.

Abstract

Automatic multi-agent systems (MAS) aim to instantiate agent workflows without relying on manually designed or fixed orchestration. However, existing automatic MAS approaches remain only partially adaptive: they either perform training-free test-time search or optimize the meta-level designer while keeping downstream execution agents frozen. This creates a frozen-executor ceiling and leaves the end-to-end training of self-designing and self-executing agentic models unexplored. To address this, we introduce MetaAgent-X, an end-to-end reinforcement learning framework that jointly optimizes automatic MAS design and execution. MetaAgent-X enables script-based MAS generation, execution rollout collection, and credit assignment for both designer and executor trajectories. To support stable and scalable optimization, we propose Executor Designer Hierarchical Rollout and Stagewise Co-evolution to…
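The abstract does not spell out how credit is split between designer and executor trajectories, so the following is only a toy sketch of one plausible reading, not the paper's algorithm. It assumes each designer sample is executed several times, that a design is scored by the mean reward of its executor rollouts, and that plain mean-centering stands in for whatever normalization the authors actually use. The function name `hierarchical_advantages` and the reward values are hypothetical.

```python
from statistics import mean


def hierarchical_advantages(rewards):
    """Toy two-level credit assignment (an assumption, not the paper's method).

    rewards[d][e] is the task reward of executor rollout e run under
    designer sample d. Returns (designer_adv, executor_adv).
    """
    # Designer level: score each design by the mean reward of its rollouts,
    # then center the scores across all sampled designs.
    design_scores = [mean(group) for group in rewards]
    baseline = mean(design_scores)
    designer_adv = [score - baseline for score in design_scores]

    # Executor level: compare each rollout only with its siblings that ran
    # under the same design, so the executor is not credited or blamed for
    # the quality of the design itself.
    executor_adv = [
        [r - score for r in group]
        for group, score in zip(rewards, design_scores)
    ]
    return designer_adv, executor_adv


# Hypothetical rewards: 3 designer samples, 2 executor rollouts each.
rewards = [[1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
designer_adv, executor_adv = hierarchical_advantages(rewards)
print(designer_adv)   # [0.0, 0.5, -0.5]
print(executor_adv)   # [[0.5, -0.5], [0.0, 0.0], [0.0, 0.0]]
```

In this toy setup the second design receives positive designer-level credit because all of its rollouts succeed, while executor-level credit is nonzero only where rollouts under the same design disagree.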

Code & Models

Models

🤗 Mercury7353/MetaAgent-X · model · 87 downloads · ♡ 4
