Silo-Bench: A Scalable Environment for Evaluating Distributed Coordination in Multi-Agent LLM Systems

Yuzhe Zhang; Feiran Liu; Yi Shan; Xinyi Huang; Xin Yang; Yueqi Zhu; Xuxin Cheng; Cao Liu; Ke Zeng; Terry Jingchen Zhang; Wenyuan Jiang

arXiv:2603.01045·cs.MA·April 15, 2026

TL;DR

SILO-BENCH is a benchmark that tests whether multi-agent LLM systems can coordinate over distributed information and synthesize it into correct answers. It reveals a fundamental reasoning gap that worsens with scale.

Contribution

Introduces SILO-BENCH, a role-agnostic benchmark of 30 algorithmic tasks across three communication complexity levels, exposing coordination and reasoning limitations in multi-agent LLM systems.

Findings

01

Agents spontaneously form task-appropriate coordination topologies and exchange information actively.

02

Agents systematically fail to synthesize distributed information into correct answers at the reasoning-integration stage.

03

Coordination overhead increases with scale, negating parallelization benefits.

Abstract

Large language models are increasingly deployed in multi-agent systems to overcome context limitations by distributing information across agents. Yet whether agents can reliably compute with distributed information, rather than merely exchange it, remains an open question. We introduce SILO-BENCH, a role-agnostic benchmark of 30 algorithmic tasks across three communication complexity levels, evaluating 54 configurations over 1,620 experiments. Our experiments expose a fundamental Communication-Reasoning Gap: agents spontaneously form task-appropriate coordination topologies and exchange information actively, yet systematically fail to synthesize distributed state into correct answers. The failure is localized to the reasoning-integration stage where agents often acquire sufficient information but cannot integrate it. This coordination overhead compounds with scale, eventually…
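
As a rough illustration of the setting the abstract describes, the sketch below splits a small input across several agents so that no single agent can answer alone, then checks a combined answer against the centralized one. It is a toy under stated assumptions, not the benchmark's code: the names `shard` and `solve_distributed`, the round-robin split, and the choice of a global-maximum task are all hypothetical. The final lines note that 30 tasks times 54 configurations equals the reported 1,620 experiments; reading this as one run per task-configuration pair is an assumption.

```python
from typing import Callable, List, Sequence


def shard(data: Sequence[int], n_agents: int) -> List[List[int]]:
    """Split data round-robin so each hypothetical agent holds only its own silo."""
    return [list(data[i::n_agents]) for i in range(n_agents)]


def solve_distributed(
    shards: List[List[int]],
    local: Callable[[List[int]], int],
    merge: Callable[[List[int]], int],
) -> int:
    """Compute one partial result per silo, then merge the partials.

    In a real multi-agent run, the merge step is where agents would have to
    integrate what they received from peers; here it is a plain function call.
    """
    partials = [local(s) for s in shards]
    return merge(partials)


# Toy task (assumed for illustration): find the global maximum.
data = [7, 3, 9, 1, 4, 8]
shards = shard(data, n_agents=3)

distributed_max = solve_distributed(shards, local=max, merge=max)
centralized_max = max(data)

# An evaluation harness could score a run by comparing the two answers.
is_correct = distributed_max == centralized_max

# Arithmetic check on the counts reported in the abstract.
N_TASKS = 30
N_CONFIGS = 54
total_runs = N_TASKS * N_CONFIGS  # matches the reported 1,620 experiments
```

In this toy the merge step is trivially correct. The gap reported in the abstract concerns LLM agents failing at the analogous integration step even after they have acquired sufficient information.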

Code & Models

Repositories

jwyjohn/acl26-silo-bench (GitHub)