An Empirical Study of Multi-Agent Collaboration for Automated Research
Yang Shen, Zhenyi Yi, Ziyi Zhao, Lijun Sun, Dongyang Li, Chin-Teng Lin, Yuhui Shi

TL;DR
This paper empirically compares different multi-agent system architectures for automated research, revealing trade-offs between stability and depth of optimization under fixed computational budgets.
Contribution
It introduces a systematic empirical framework for benchmarking multi-agent collaboration structures in automated machine learning optimization.
Findings
The subagent architecture is resilient and effective for shallow, broad searches under strict time constraints.
The agent team architecture is more fragile but better suited to deep, complex optimization when the compute budget is extended.
Empirical results guide the design of adaptive multi-agent systems for automated research.
Abstract
As AI agents evolve, the community is rapidly shifting from single Large Language Models (LLMs) to Multi-Agent Systems (MAS) to overcome cognitive bottlenecks in automated research. However, the optimal multi-agent coordination framework for these autonomous agents remains largely unexplored. In this paper, we present a systematic empirical study investigating the comparative efficacy of distinct multi-agent structures for automated machine learning optimization. Utilizing a rigorously controlled, execution-based testbed equipped with Git worktree isolation and explicit global memory, we benchmark a single-agent baseline against two multi-agent paradigms: a subagent architecture (parallel exploration with post-hoc consolidation) and an agent team architecture (experts with pre-execution handoffs). By evaluating these systems under strictly fixed computational time budgets, our findings…
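To make the two coordination patterns in the abstract concrete, the following is a minimal sketch of their control flow only. It is not the authors' testbed. Every name here (`toy_loss`, `run_subagents`, `run_team`, the `lr`/`wd` config keys) is hypothetical, the agents are plain functions rather than LLMs, the fixed budget is counted in evaluations as a stand-in for wall-clock time, and Git worktree isolation and global memory are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a config is a dict of hyperparameters, an agent is a
# function that edits a config, and an evaluator returns a loss (lower is better).
Config = Dict[str, float]
Agent = Callable[[Config], Config]
Evaluator = Callable[[Config], float]


def toy_loss(cfg: Config) -> float:
    """Stand-in for a real training run; minimum at lr=0.1, wd=0.01."""
    return (cfg["lr"] - 0.1) ** 2 + (cfg["wd"] - 0.01) ** 2


def run_subagents(base: Config, agents: List[Agent],
                  evaluate: Evaluator, budget: int) -> Tuple[Config, float]:
    """Parallel exploration with post-hoc consolidation.

    Each agent edits its own copy of the base config, the candidates are
    evaluated in parallel, and the best result is kept afterwards.
    """
    best_cfg, best_loss = dict(base), evaluate(base)  # baseline costs 1 evaluation
    candidates = [agent(dict(base)) for agent in agents][: max(0, budget - 1)]
    with ThreadPoolExecutor(max_workers=max(1, len(candidates))) as pool:
        losses = list(pool.map(evaluate, candidates))
    for cfg, loss in zip(candidates, losses):
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss


def run_team(base: Config, agents: List[Agent],
             evaluate: Evaluator, budget: int) -> Tuple[Config, float]:
    """Experts with pre-execution handoffs.

    Each expert edits the shared config in turn and hands it to the next;
    only the jointly edited config is executed.
    """
    best_cfg, best_loss = dict(base), evaluate(base)  # baseline costs 1 evaluation
    if budget > 1:
        cfg = dict(base)
        for agent in agents:
            cfg = agent(cfg)  # handoff: the next expert sees the previous edits
        loss = evaluate(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss


# Two hypothetical "experts", each fixing one hyperparameter.
def tune_lr(cfg: Config) -> Config:
    return {**cfg, "lr": 0.1}


def tune_wd(cfg: Config) -> Config:
    return {**cfg, "wd": 0.01}


base = {"lr": 0.5, "wd": 0.5}
sub_cfg, sub_loss = run_subagents(base, [tune_lr, tune_wd], toy_loss, budget=3)
team_cfg, team_loss = run_team(base, [tune_lr, tune_wd], toy_loss, budget=3)
```

In this toy setup the subagent run keeps the single best independent edit, while the team run executes one config that carries both edits. A faulty edit in the team's chain would affect its only execution, whereas a faulty subagent candidate is simply discarded at consolidation. The sketch says nothing about the paper's measured results.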
