Can AI Agents Agree?
Frédéric Berdoz, Leonardo Rugli, Roger Wattenhofer

TL;DR
This study systematically evaluates large language model-based agents in a Byzantine consensus game, revealing that they often fail to reliably reach agreement even in benign, no-stake scenarios, especially as group size increases.
Contribution
The paper provides the first large-scale empirical analysis of LLM agents' ability to reach consensus in adversarial settings and highlights their current limitations.
Findings
Valid agreement is unreliable even in benign settings.
Group size and Byzantine presence significantly reduce consensus success.
Failures mainly involve loss of liveness, not subtle value corruption.
Abstract
Large language models are increasingly deployed as cooperating agents, yet their behavior in adversarial consensus settings has not been systematically studied. We evaluate LLM-based agents on a Byzantine consensus game over scalar values using a synchronous all-to-all simulation. We test consensus in a no-stake setting where agents have no preferences over the final value, so evaluation focuses on agreement rather than value optimality. Across hundreds of simulations spanning model sizes, group sizes, and Byzantine fractions, we find that valid agreement is not reliable even in benign settings and degrades as group size grows. Introducing a small number of Byzantine agents further reduces success. Failures are dominated by loss of liveness, such as timeouts and stalled convergence, rather than subtle value corruption. Overall, the results suggest that reliable agreement is not yet a…
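To make the setup in the abstract concrete, here is a minimal sketch of a synchronous all-to-all consensus game over scalar values. It is not the authors' simulation: the LLM agents are replaced by a simple rule (honest agents adopt the lower median of what they receive), Byzantine agents send arbitrary values that may differ per receiver, and "valid" is assumed to mean that the agreed value lies within the range of the honest agents' initial proposals. The function name `run_consensus` and all of its parameters are hypothetical.

```python
import random
from statistics import median_low


def run_consensus(n_honest, n_byzantine, max_rounds=20, seed=0, lo=0, hi=100):
    """Toy synchronous all-to-all consensus game over integer scalars."""
    rng = random.Random(seed)
    # Hypothetical setup: honest agents start from random proposals.
    values = [rng.randint(lo, hi) for _ in range(n_honest)]
    init_min, init_max = min(values), max(values)

    for round_no in range(1, max_rounds + 1):
        new_values = []
        for _ in range(n_honest):
            # Each honest agent receives every honest value (its own included)
            # plus one arbitrary value per Byzantine agent. Byzantine agents
            # may tell each receiver something different.
            byzantine_msgs = [rng.randint(lo, hi) for _ in range(n_byzantine)]
            new_values.append(median_low(values + byzantine_msgs))
        values = new_values

        # Agreement: all honest agents hold the same value after the round.
        if len(set(values)) == 1:
            agreed = values[0]
            # Assumed validity rule: agreed value within the honest initial range.
            valid = init_min <= agreed <= init_max
            return {"outcome": "valid" if valid else "invalid",
                    "rounds": round_no, "value": agreed}

    # Liveness failure: no agreement within the round budget.
    return {"outcome": "timeout", "rounds": max_rounds, "value": None}


if __name__ == "__main__":
    for n_byz in (0, 1, 2):
        print(n_byz, run_consensus(n_honest=7, n_byzantine=n_byz))
```

The three outcomes mirror the distinction drawn in the abstract: a `timeout` is a loss of liveness, while an `invalid` result would correspond to value corruption. With no Byzantine agents, every honest agent sees the same messages, so this rule-based stand-in agrees after one round; the paper's finding is that LLM agents do not behave this reliably.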
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Ethics and Social Impacts of AI · Opinion Dynamics and Social Influence
