Decentralised Learning in Systems with Many, Many Strategic Agents
David Mguni, Joel Jennings, Enrique Munoz de Cote

TL;DR
This paper introduces a scalable, decentralised multi-agent reinforcement learning method that is guaranteed to converge to equilibrium policies in systems with an extremely large number of strategic agents, validated through applications in economics and control.
Contribution
It presents a novel, model-free, decentralised learning protocol that converges to equilibrium policies regardless of the number of agents, addressing the poor scalability and missing convergence guarantees of existing multi-agent reinforcement learning.
Findings
Convergence to Nash equilibrium in large-scale multi-agent systems.
Method is model-free and requires only local information.
Validated in economics and control applications with thousands of agents.
Abstract
Although multi-agent reinforcement learning can tackle systems of strategically interacting entities, it currently fails in scalability and lacks rigorous convergence guarantees. Crucially, learning in multi-agent systems can become intractable due to the explosion in the size of the state-action space as the number of agents increases. In this paper, we propose a method for computing closed-loop optimal policies in multi-agent systems that scales independently of the number of agents. This allows us to show, for the first time, successful convergence to optimal behaviour in systems with an unbounded number of interacting adaptive learners. Studying the asymptotic regime of N-player stochastic games, we devise a learning protocol that is guaranteed to converge to equilibrium policies even when the number of agents is extremely large. Our method is model-free and completely decentralised…
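The state-action explosion mentioned in the abstract can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the paper: it assumes every agent has the same finite number of local states and actions, and the function names `joint_space_size` and `local_space_size` are hypothetical. Under that assumption, a learner that reasons over the joint configuration of all N agents faces (|S| * |A|)^N state-action pairs, while a quantity defined on a single agent's local state and action stays at |S| * |A| no matter how large N is.

```python
def joint_space_size(num_states: int, num_actions: int, num_agents: int) -> int:
    """Count joint state-action pairs when each of num_agents agents
    independently occupies one of num_states states and picks one of
    num_actions actions (an illustrative assumption, not the paper's model)."""
    return (num_states * num_actions) ** num_agents


def local_space_size(num_states: int, num_actions: int) -> int:
    """Count state-action pairs seen by a single agent in isolation.
    This does not depend on the number of agents."""
    return num_states * num_actions


if __name__ == "__main__":
    states, actions = 10, 4  # hypothetical per-agent sizes
    for n in (1, 2, 5, 10, 100, 1000):
        joint = joint_space_size(states, actions, n)
        # Report the number of decimal digits, since the raw value is huge.
        print(
            f"N={n:>4}: local size = {local_space_size(states, actions)}, "
            f"joint size has {len(str(joint))} digits"
        )
```

The exponential dependence on N in the joint count is why tabular or naive joint-action methods become intractable, and why a method whose cost does not grow with N matters for systems with thousands of agents.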
