Collective Alignment in LLM Multi-Agent Systems: Disentangling Bias from Cooperation via Statistical Physics
Cristiano De Nobili

TL;DR
This paper models LLM-based multi-agent systems on a lattice, using statistical physics to distinguish social conformity from intrinsic bias, revealing that bias dominates collective alignment.
Contribution
It introduces a model-agnostic framework to analyze emergent collective behavior and phase transitions in LLM multi-agent systems, quantifying social conformity and bias.
Findings
All models show temperature-driven order-disorder crossovers.
Effective exponents are close to, but do not match, those of the 2D Ising universality class.
Intrinsic bias dominates over neighbor coupling in collective alignment.
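The competition named in the last finding can be made concrete with an Ising-like effective description. The form below is only a sketch under our own assumptions: the symbols $J$ (neighbor coupling), $h$ (intrinsic bias), and $\beta_{\mathrm{eff}}$ (an effective inverse temperature) are illustrative and are not taken from the paper, which may use a different parametrization.

```latex
P\bigl(s_i = +1 \,\big|\, \{s_j\}\bigr)
  = \frac{1}{1 + \exp\!\Bigl[-2\,\beta_{\mathrm{eff}}\Bigl(J \sum_{j \in \mathrm{nn}(i)} s_j + h\Bigr)\Bigr]},
\qquad s_i \in \{+1, -1\}
```

In this picture, social conformity enters through the term $J \sum_j s_j$, which changes sign with the neighborhood, while the bias $h$ pushes every agent toward the same answer regardless of its neighbors. "Bias dominates" would then correspond to $|h|$ being comparable to or larger than $J$ times the typical neighbor sum.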
Abstract
We investigate the emergent collective dynamics of LLM-based multi-agent systems on a 2D square lattice and present a model-agnostic statistical-physics method to disentangle social conformity from intrinsic bias, compute critical exponents, and probe the collective behavior and possible phase transitions of multi-agent systems. In our framework, each node of an L × L lattice hosts an identical LLM agent holding a binary state (+1/-1, mapped to yes/no) and updating it by querying the model conditioned on the four nearest-neighbor states. The sampler temperature serves as the sole control parameter. Across three open-weight models (llama3.1:8b, phi4-mini:3.8b, mistral:7b), we measure magnetization and susceptibility under a global-flip protocol designed to probe symmetry. All models display temperature-driven order-disorder crossovers and susceptibility…
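The lattice setup described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: `mock_agent` is a hypothetical stand-in for the LLM query (a logistic rule with made-up `coupling` and `bias` values), and the periodic boundaries, random sequential updates, ordered initial state, and fluctuation-based susceptibility are all assumptions chosen for simplicity. The global-flip protocol is not reproduced here.

```python
import math
import random


def mock_agent(neighbors, temperature, rng, coupling=1.0, bias=0.5):
    """Hypothetical stand-in for an LLM call: returns +1 (yes) or -1 (no).

    A real experiment would prompt the model with the four neighbor states
    and sample its answer at the given sampler temperature.
    """
    field = coupling * sum(neighbors) + bias  # conformity term + intrinsic bias
    if temperature <= 0:
        return 1 if field >= 0 else -1  # greedy limit
    # Equivalent to 1 / (1 + exp(-2 * field / T)), but cannot overflow.
    p_yes = 0.5 * (1.0 + math.tanh(field / temperature))
    return 1 if rng.random() < p_yes else -1


def sweep(spins, temperature, rng):
    """One sweep: L*L random single-site updates (assumed update scheme)."""
    size = len(spins)
    for _ in range(size * size):
        i, j = rng.randrange(size), rng.randrange(size)
        neighbors = [
            spins[(i - 1) % size][j],  # periodic boundaries (assumption)
            spins[(i + 1) % size][j],
            spins[i][(j - 1) % size],
            spins[i][(j + 1) % size],
        ]
        spins[i][j] = mock_agent(neighbors, temperature, rng)


def magnetization(spins):
    """Mean state per site, in [-1, 1]."""
    size = len(spins)
    return sum(sum(row) for row in spins) / (size * size)


def measure(size, temperature, n_sweeps=100, burn_in=20, seed=0):
    """Return (<|m|>, chi) with chi = N * (<m^2> - <|m|>^2).

    This susceptibility is one common fluctuation-based convention,
    defined up to a temperature-dependent prefactor.
    """
    rng = random.Random(seed)
    spins = [[1] * size for _ in range(size)]  # ordered start (assumption)
    samples = []
    for step in range(n_sweeps):
        sweep(spins, temperature, rng)
        if step >= burn_in:
            samples.append(magnetization(spins))
    mean_abs = sum(abs(m) for m in samples) / len(samples)
    mean_sq = sum(m * m for m in samples) / len(samples)
    chi = size * size * (mean_sq - mean_abs ** 2)
    return mean_abs, chi


if __name__ == "__main__":
    for temp in (0.2, 2.0, 5.0, 50.0):
        mean_abs, chi = measure(8, temp)
        print(f"T={temp:5.1f}  <|m|>={mean_abs:.3f}  chi={chi:.3f}")
```

Replacing `mock_agent` with an actual model query and scanning the temperature would yield the magnetization and susceptibility curves that the abstract refers to. With the mock rule, the nonzero `bias` breaks the yes/no symmetry in the same way an intrinsic model preference would.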
