Superminds Test: Actively Evaluating Collective Intelligence of Agent Society via Probing Agents
Xirui Li, Ming Li, Yunze Xiao, Ryan Wong, Dianqi Li, Timothy Baldwin, Tianyi Zhou

TL;DR
This paper empirically evaluates whether large-scale agent societies exhibit collective intelligence, finding that current systems lack emergent intelligence due to shallow interactions and limited information exchange.
Contribution
It introduces Superminds Test, a hierarchical probing framework that uses controlled Probing Agents to assess collective intelligence in large agent populations across three tiers: joint reasoning, information synthesis, and basic interaction.
Findings
The agent society does not outperform individual frontier models on complex reasoning tasks.
The society rarely synthesizes information that is distributed across agents.
Interactions are shallow and often off-topic.
Abstract
Collective intelligence refers to the ability of a group to achieve outcomes beyond what any individual member can accomplish alone. As large language model agents scale to populations of millions, a key question arises: Does collective intelligence emerge spontaneously from scale? We present the first empirical evaluation of this question in a large-scale autonomous agent society. Studying MoltBook, a platform hosting over two million agents, we introduce Superminds Test, a hierarchical framework that probes society-level intelligence using controlled Probing Agents across three tiers: joint reasoning, information synthesis, and basic interaction. Our experiments reveal a stark absence of collective intelligence. The society fails to outperform individual frontier models on complex reasoning tasks, rarely synthesizes distributed information, and often fails even trivial coordination…
