TL;DR
MetaAgent-X introduces an end-to-end reinforcement learning framework that jointly optimizes the design and execution of multi-agent systems, surpassing existing methods and enabling self-designing, self-executing agents.
Contribution
It presents a novel end-to-end RL approach for automatic multi-agent systems (MAS) that jointly optimizes design and execution, with new techniques for stable training and co-evolution.
Findings
MetaAgent-X achieves up to 21.7% performance gains over baselines.
Both designer and executor improve during training.
Stagewise co-evolution is effective for automatic MAS learning.
Abstract
Automatic multi-agent systems aim to instantiate agent workflows without relying on manually designed or fixed orchestration. However, existing automatic MAS approaches remain only partially adaptive: they either perform training-free test-time search or optimize the meta-level designer while keeping downstream execution agents frozen, creating a frozen-executor ceiling and leaving the end-to-end training of self-designing and self-executing agentic models unexplored. To address this, we introduce MetaAgent-X, an end-to-end reinforcement learning framework that jointly optimizes automatic MAS design and execution. MetaAgent-X enables script-based MAS generation, execution rollout collection, and credit assignment for both designer and executor trajectories. To support stable and scalable optimization, we propose Executor Designer Hierarchical Rollout and Stagewise Co-evolution to…
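The abstract mentions credit assignment for both designer and executor trajectories but does not spell out the scheme. As a rough illustration only, the sketch below assumes a two-level, mean-baseline scheme: several designs are sampled per task, each design is executed several times, the designer is credited relative to the average across designs, and each execution is credited relative to the average within its own design. The names `hierarchical_credit`, `Credit`, and `toy_reward`, and the baseline choice itself, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List


@dataclass
class Credit:
    design: str
    designer_advantage: float         # design's mean reward minus the cross-design mean
    executor_advantages: List[float]  # each execution's reward minus its design's mean


def hierarchical_credit(
    designs: List[str],
    run_executor: Callable[[str, int], float],
    executions_per_design: int,
) -> List[Credit]:
    """Assumed two-level credit assignment (illustrative, not the paper's algorithm)."""
    # Inner level: execute every sampled design several times and record rewards.
    rewards = [
        [run_executor(design, i) for i in range(executions_per_design)]
        for design in designs
    ]
    # Outer level: score each design by the mean reward of its executions.
    design_scores = [mean(r) for r in rewards]
    baseline = mean(design_scores)
    return [
        Credit(
            design=design,
            designer_advantage=score - baseline,
            executor_advantages=[x - score for x in r],
        )
        for design, r, score in zip(designs, rewards, design_scores)
    ]


def toy_reward(design: str, i: int) -> float:
    # Hypothetical fixed rewards standing in for real task outcomes.
    table = {"solo": [0.0, 0.0], "plan+solve": [1.0, 0.0]}
    return table[design][i]


credits = hierarchical_credit(["solo", "plan+solve"], toy_reward, 2)
```

In this toy run the "plan+solve" design gets a positive designer advantage because its executions score higher on average, while its two executions get opposite-signed executor advantages because only one of them succeeded. A real system would feed such advantages into a policy-gradient update for each role.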
