Systematic Failures in Collective Reasoning under Distributed Information in Multi-Agent LLMs
Yuxuan Li, Aoi Naito, Hirokazu Shirado

TL;DR
This paper introduces HiddenBench, a benchmark revealing that multi-agent LLMs struggle with collective reasoning under distributed information, often failing to recognize unexpressed knowledge, which limits their decision-making capabilities.
Contribution
The paper presents HiddenBench, a new benchmark for evaluating collective reasoning in multi-agent LLMs, and identifies systematic failures and potential improvements through structured communication protocols.
Findings
Multi-agent LLMs achieve only 30.1% accuracy under distributed information, versus 80.7% for single agents given complete information.
Failures persist across prompting strategies and worsen with larger groups.
Structured communication improves collective reasoning performance.
Abstract
Multi-agent systems built on large language models (LLMs) are expected to enhance decision-making by pooling distributed information, yet systematically evaluating this capability has remained challenging. We introduce HiddenBench, a 65-task benchmark grounded in the Hidden Profile paradigm, which isolates collective reasoning under distributed information from individual reasoning ability. Evaluating 15 frontier LLMs, we find that multi-agent LLMs achieve only 30.1% accuracy under distributed information, compared to 80.7% accuracy for single agents given complete information. We trace this gap to a systematic failure mode: agents cannot recognize or act under latent information asymmetry -- they fail to reason about what others might know but have not yet expressed, leading to premature convergence on shared evidence while critical distributed facts remain unexplored. These failures…
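To make the Hidden Profile setup concrete, the following is a minimal illustrative sketch, not a task from HiddenBench. It assumes a toy scoring rule (each fact adds or subtracts one point for an option), and the names `Fact`, `best_option`, `agent_view`, and `pooled_view` are hypothetical. The point is only to show how shared evidence can favor one option while the pooled evidence favors another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    option: str  # the option this fact is about
    weight: int  # +1 supports the option, -1 counts against it


def best_option(facts):
    """Return the option with the highest net support (ties broken alphabetically)."""
    scores = {}
    for fact in facts:
        scores[fact.option] = scores.get(fact.option, 0) + fact.weight
    return max(sorted(scores), key=scores.get)


# Shared facts are visible to every agent and favor option "A".
shared = [Fact("A", +1), Fact("A", +1), Fact("A", +1), Fact("B", +1)]

# Each unshared fact is held by exactly one agent (toy assumption).
unshared = {
    "agent_1": [Fact("A", -1)],
    "agent_2": [Fact("B", +1)],
    "agent_3": [Fact("B", +1)],
}


def agent_view(name):
    """Facts a single agent sees before any discussion."""
    return shared + unshared[name]


def pooled_view():
    """Facts available if every agent surfaces its unshared information."""
    return shared + [fact for facts in unshared.values() for fact in facts]


for name in unshared:
    print(name, "prefers", best_option(agent_view(name)))
print("pooled evidence favors", best_option(pooled_view()))
```

In this toy profile every agent prefers "A" on its own view, while the pooled evidence favors "B". A group that converges early on the shared facts therefore settles on the wrong option, which is the kind of premature convergence the abstract describes.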
