Emergent Coordination and Phase Structure in Independent Multi-Agent Reinforcement Learning
Azusa Yamaguchi

TL;DR
This paper maps the phase structure of decentralized multi-agent reinforcement learning, showing how coordination emerges, fluctuates, or collapses depending on environment size, agent density, and kernel drift.
Contribution
The paper introduces a phase map for independent Q-learning, identifying distinct regimes and the role of kernel drift and synchronization in emergent coordination.
Findings
Three distinct phases are identified: a coordinated and stable phase, a fragile transition region, and a jammed or disordered phase.
Kernel drift is essential for phase transitions and coordination dynamics.
Removing agent identifiers eliminates drift and the phase structure.
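Kernel drift, in the sense used above, is the change over time in the transition kernel one agent effectively experiences as the other agents update their policies. The paper's own drift estimator is not given in this summary, so the following is only a minimal sketch of one plausible way to quantify it: estimate the empirical kernel P(s' | s, a) from one agent's transitions in an early and a late time window, then average the total-variation distance over the state-action pairs seen in both. The function names `empirical_kernel` and `kernel_drift`, and the choice of total-variation distance, are assumptions made for illustration.

```python
from collections import Counter


def empirical_kernel(transitions):
    """Estimate P(s' | s, a) from a list of (s, a, s_next) tuples."""
    counts = {}
    for s, a, s_next in transitions:
        counts.setdefault((s, a), Counter())[s_next] += 1
    kernel = {}
    for sa, c in counts.items():
        total = sum(c.values())
        kernel[sa] = {s2: n / total for s2, n in c.items()}
    return kernel


def kernel_drift(early, late):
    """Mean total-variation distance between two empirical kernels.

    Hypothetical drift measure (an assumption, not the paper's definition):
    only (s, a) pairs observed in both windows are compared.
    """
    p, q = empirical_kernel(early), empirical_kernel(late)
    shared = p.keys() & q.keys()
    if not shared:
        return 0.0
    total = 0.0
    for sa in shared:
        support = p[sa].keys() | q[sa].keys()
        total += 0.5 * sum(
            abs(p[sa].get(s2, 0.0) - q[sa].get(s2, 0.0)) for s2 in support
        )
    return total / len(shared)
```

A value near 0 means the agent sees roughly the same dynamics in both windows; a value near 1 means the next-state distributions barely overlap. With few samples per state-action pair the estimate is noisy, so in practice the windows would need to be long enough for the counts to be meaningful.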
Abstract
A clearer understanding of when coordination emerges, fluctuates, or collapses in decentralized multi-agent reinforcement learning (MARL) is increasingly sought in order to characterize the dynamics of multi-agent learning systems. We revisit fully independent Q-learning (IQL) as a minimal decentralized testbed and run large-scale experiments across environment size L and agent density rho. We construct a phase map using two axes - the cooperative success rate (CSR) and a stability index derived from TD-error variance - revealing three distinct regimes: a coordinated and stable phase, a fragile transition region, and a jammed or disordered phase. A sharp double Instability Ridge separates these regimes and corresponds to persistent kernel drift, the time-varying shift of each agent's effective transition kernel induced by others' policy updates. Synchronization analysis further shows…
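To make the two phase-map axes concrete, here is a minimal sketch of fully independent Q-learning that records a cooperative success rate and the variance of TD errors. It is not the paper's setup: the environment is a hypothetical stateless coordination game (all agents are rewarded only when they pick the same action) rather than a grid of size L with density rho, and the raw TD-error variance stands in for the stability index, whose exact derivation is not given in this summary. The function name `run_iql` and all hyperparameters are assumptions.

```python
import random
from statistics import pvariance


def run_iql(n_agents=2, n_actions=2, episodes=2000, alpha=0.1, eps=0.1, seed=0):
    """Independent Q-learning on a toy stateless coordination game.

    Each agent keeps its own Q-values and ignores the others, so the other
    agents' learning appears to it as a changing environment.
    Returns (cooperative success rate, TD-error variance) over the second
    half of training.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_agents)]
    td_errors, successes = [], []
    for _ in range(episodes):
        actions = []
        for i in range(n_agents):
            if rng.random() < eps:
                # Explore: uniform random action.
                actions.append(rng.randrange(n_actions))
            else:
                # Exploit: greedy action, ties broken at random.
                best = max(q[i])
                ties = [a for a in range(n_actions) if q[i][a] == best]
                actions.append(rng.choice(ties))
        success = len(set(actions)) == 1
        reward = 1.0 if success else 0.0
        for i, a in enumerate(actions):
            # Stateless game, so the TD error has no bootstrap term.
            delta = reward - q[i][a]
            q[i][a] += alpha * delta
            td_errors.append(delta)
        successes.append(success)
    tail = episodes // 2
    csr = sum(successes[-tail:]) / tail
    td_var = pvariance(td_errors[-tail * n_agents:])
    return csr, td_var


if __name__ == "__main__":
    csr, td_var = run_iql()
    print(f"CSR={csr:.3f}  TD-error variance={td_var:.4f}")
```

Sweeping a setting such as `n_agents` and plotting the two returned values against each other gives a crude analogue of a phase map, though nothing here should be read as reproducing the regimes reported in the paper.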
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural Networks and Reservoir Computing · stochastic dynamics and bifurcation
