Asymptotic Extinction in Large Coordination Games
Desmond Chan, Bart De Keijzer, Tobias Galla, Stefanos Leonardos, Carmine Ventre

TL;DR
This paper analyzes how Q-Learning behaves in large multiplayer coordination games, revealing conditions for convergence and a phenomenon called asymptotic extinction where many actions become rarely played as the game size grows.
Contribution
It characterizes the critical exploration rate for convergence in large games and introduces the concept of asymptotic extinction of actions as game size increases.
Findings
The critical exploration rate increases with the number of players and with the alignment of their payoffs.
Q-Learning converges to a boundary of the action space in large games.
Asymptotic extinction causes many actions to have near-zero probability in large games.
Abstract
We study the exploration-exploitation trade-off for large multiplayer coordination games where players strategise via Q-Learning, a common learning framework in multi-agent reinforcement learning. Q-Learning is known to have two shortcomings, namely non-convergence and potential equilibrium selection problems, when there are multiple fixed points, called Quantal Response Equilibria (QRE). Furthermore, whilst QRE have full support for finite games, it is not clear how Q-Learning behaves as the game becomes large. In this paper, we characterise the critical exploration rate that guarantees convergence to a unique fixed point, addressing the two shortcomings above. Using a generating-functional method, we show that this rate increases with the number of players and the alignment of their payoffs. For many-player coordination games with perfectly aligned payoffs, this exploration rate is…
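The link between the exploration rate and the uniqueness of the fixed point can be illustrated on a toy example. The sketch below is not the paper's generating-functional analysis. It assumes a two-player, two-action coordination game with identical payoffs, models the exploration rate as a softmax temperature `T`, and finds logit-response fixed points (QRE) by a damped iteration. The names `softmax` and `find_qre`, the payoff matrix, the step size, and the starting points are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def find_qre(payoff, T, x0, y0, step=0.1, iters=5000):
    """Damped logit-response iteration (an assumed stand-in for Q-Learning).

    payoff : matrix shared by both players (perfectly aligned payoffs)
    T      : softmax temperature, standing in for the exploration rate
    x0, y0 : initial mixed strategies of player 1 and player 2
    Fixed points of this iteration are logit QRE of the toy game.
    """
    x = np.array(x0, dtype=float)
    y = np.array(y0, dtype=float)
    for _ in range(iters):
        # Each player softly best-responds to the other's current strategy.
        target_x = softmax(payoff @ y / T)
        target_y = softmax(payoff.T @ x / T)
        x = (1 - step) * x + step * target_x
        y = (1 - step) * y + step * target_y
    return x, y

# Toy coordination game: both players earn 1 if they match, 0 otherwise.
payoff = np.eye(2)

# Two starting points, each biased towards a different action.
starts = [([0.9, 0.1], [0.9, 0.1]), ([0.1, 0.9], [0.1, 0.9])]

limits = {}
for T in (0.2, 1.0):
    limits[T] = [find_qre(payoff, T, x0, y0)[0] for x0, y0 in starts]
    print(f"T={T}: limits of player 1 =", [np.round(x, 3) for x in limits[T]])
```

In this toy game the low temperature leaves the two runs at different limits, so the outcome depends on where learning starts, while the high temperature brings both runs to the same uniform strategy. This mirrors, in miniature, the idea of a critical exploration rate above which the fixed point is unique.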
Taxonomy
Topics: Advanced Mathematical Modeling in Engineering
