Benchmarking Emergent Coordination in Large-Scale LLM Populations: An Evaluation Framework on the MoltBook Archive
Brandon Yee, Pairie Koh

TL;DR
This paper introduces a comprehensive evaluation framework for emergent coordination in large-scale multi-agent LLM systems, demonstrated on the MoltBook dataset with quantitative analysis of coordination dynamics.
Contribution
It presents a novel, systematic benchmarking framework for assessing coordination, information diffusion, and role specialization in large, decentralized LLM populations.
Findings
Pronounced core-periphery structure with silhouette 0.91
Heavy-tailed cascade distributions with alpha=2.57
Severe coordination overhead in decentralized tasks (Cohen's d=-0.88)
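Two of the statistics above have standard textbook definitions, sketched below for readers who want to see how such figures are typically obtained. This is an illustration under stated assumptions, not the authors' pipeline: the summary does not say which estimators were used, so the pooled-standard-deviation form of Cohen's d and the continuous maximum-likelihood estimator for the power-law exponent are assumed here. The function names and the hand-picked `xmin` cutoff are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(treatment, control):
    """Cohen's d using the pooled sample standard deviation (assumed form)."""
    n1, n2 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_var)


def power_law_alpha(sizes, xmin):
    """Continuous MLE for a power-law exponent on the tail x >= xmin:
    alpha = 1 + n / sum(ln(x_i / xmin)).
    xmin is supplied by the caller; choosing it is a separate problem.
    """
    tail = [x for x in sizes if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one observation strictly above xmin")
    return 1 + len(tail) / log_sum


# Toy numbers for illustration only; these are not data from the archive.
decentralized_scores = [1.0, 2.0, 3.0]
baseline_scores = [2.0, 3.0, 4.0]
d = cohens_d(decentralized_scores, baseline_scores)

toy_cascade_sizes = [1.0, math.e, math.e]
alpha = power_law_alpha(toy_cascade_sizes, xmin=1.0)
```

A negative d, as in the reported -0.88, means the first group scored lower than the second in units of pooled standard deviation. An exponent between 2 and 3, as in the reported 2.57, corresponds to a distribution with finite mean but infinite variance, which is what "heavy-tailed" implies for cascade sizes.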
Abstract
As multi-agent Large Language Model (LLM) systems scale, evaluating their emergent coordination dynamics becomes increasingly critical. However, current evaluation paradigms, which focus on single agents or small, explicitly structured groups, fail to capture the self-organization and viral information dynamics that arise in large, decentralized populations. We introduce a systematic evaluation framework to benchmark role specialization, information diffusion, and cooperative task resolution in open agent environments. We demonstrate this framework on the MoltBook Observatory Archive, a dataset of 2.73M interactions among 90,704 autonomous agents, establishing quantitative baselines for emergent coordination. Our evaluation reveals a pronounced core-periphery structure (silhouette 0.91), heavy-tailed cascade distributions (alpha=2.57), and severe coordination overhead in decentralized…
