Single-Agent Scaling Fails Multi-Agent Intelligence: Towards Foundation Models with Native Multi-Agent Intelligence

Shuyue Hu; Haoyang Yan; Yiqun Zhang; Yang Chen; Dongzhan Zhou; Lei Bai

arXiv:2512.08743 · cs.AI · December 17, 2025

Open Access

TL;DR

This paper shows that scaling foundation models on single-agent tasks does not automatically produce robust multi-agent intelligence, and argues that multi-agent capabilities must be built into foundation models natively.

Contribution

The paper provides empirical evidence that current large language models lack robust multi-agent capabilities and outlines future research directions for developing native multi-agent foundation models.

Findings

01 Scaling single-agent performance does not automatically yield robust multi-agent intelligence.

02 Experiments across 41 large language models and 7 benchmarks show this gap.

03 The paper outlines research directions for native multi-agent foundation models, spanning dataset construction, evaluation, training paradigms, and safety considerations.

Abstract

Foundation models (FMs) are increasingly assuming the role of the "brain" of AI agents. While recent efforts have begun to equip FMs with native single-agent abilities, such as GUI interaction or integrated tool use, we argue that the next frontier is endowing FMs with native multi-agent intelligence. We identify four core capabilities of FMs in multi-agent contexts: understanding, planning, efficient communication, and adaptation. Contrary to assumptions about the spontaneous emergence of such abilities, we provide extensive empirical evidence, across 41 large language models and 7 challenging benchmarks, showing that scaling single-agent performance alone does not automatically yield robust multi-agent intelligence. To address this gap, we outline key research directions, spanning dataset construction, evaluation, training paradigms, and safety considerations, for building…

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Explainable Artificial Intelligence (XAI)