Independent Policy Mirror Descent for Markov Potential Games: Scaling to Large Number of Players
Pragnya Alatur, Anas Barakat, Niao He

TL;DR
This paper studies an independent policy mirror descent (PMD) algorithm for Markov Potential Games (MPGs) and shows that its iteration complexity grows with the square root of the number of agents, which makes it better suited to large multi-agent systems.
Contribution
The paper shows that independent PMD with KL regularization, also known as natural policy gradient, improves the dependence of the iteration complexity on the number of agents in MPGs from linear to √N.
Findings
Iteration complexity scales as √N with natural policy gradient.
Complexity is independent of action space size.
Improves over prior linear dependence results.
Abstract
Markov Potential Games (MPGs) form an important sub-class of Markov games, which are a common framework to model multi-agent reinforcement learning problems. In particular, MPGs include as a special case the identical-interest setting where all the agents share the same reward function. Scaling the performance of Nash equilibrium learning algorithms to a large number of agents is crucial for multi-agent systems. To address this important challenge, we focus on the independent learning setting where agents only have access to their local information to update their own policy. In prior work on MPGs, the iteration complexity for obtaining ε-Nash regret scales linearly with the number of agents N. In this work, we investigate the iteration complexity of an independent policy mirror descent (PMD) algorithm for MPGs. We show that PMD with KL regularization, also known as…
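To make the update concrete, here is a minimal sketch of independent PMD with KL regularization in the simplest identical-interest case. It rests on several assumptions that are not from the paper: a single-state (matrix) game with two agents, exact access to each agent's marginalized payoff, and a hand-picked payoff matrix `R` and step size `eta`. The function names `npg_step`, `run_independent_npg`, and `nash_gap` are hypothetical. With KL regularization the mirror descent step reduces to a multiplicative-weights update, and each agent updates using only its own policy and its own payoff vector.

```python
import numpy as np


def npg_step(policy, q_values, eta):
    """One KL-regularized mirror descent step for a single agent.

    The closed form is a multiplicative-weights update:
    new_policy(a) is proportional to policy(a) * exp(eta * q_values(a)).
    """
    # Subtracting the max is only for numerical stability; it cancels on normalization.
    weights = policy * np.exp(eta * (q_values - q_values.max()))
    return weights / weights.sum()


def run_independent_npg(R, eta=1.0, iters=200):
    """Two agents share the payoff matrix R (identical-interest game, an assumption)."""
    n1, n2 = R.shape
    pi1 = np.full(n1, 1.0 / n1)  # uniform initial policies
    pi2 = np.full(n2, 1.0 / n2)
    for _ in range(iters):
        # Each agent's payoff vector averages over the other agent's current policy.
        q1 = R @ pi2
        q2 = R.T @ pi1
        # Simultaneous, independent updates: no agent sees the other's policy directly.
        pi1, pi2 = npg_step(pi1, q1, eta), npg_step(pi2, q2, eta)
    return pi1, pi2


def nash_gap(R, pi1, pi2):
    """Largest gain any single agent could obtain by deviating unilaterally."""
    q1 = R @ pi2
    q2 = R.T @ pi1
    return max(q1.max() - pi1 @ q1, q2.max() - pi2 @ q2)


# Hypothetical 2x2 coordination game: both agents prefer to match on action 0.
R = np.array([[1.0, 0.0],
              [0.0, 0.5]])
pi1, pi2 = run_independent_npg(R)
gap = nash_gap(R, pi1, pi2)
```

In this toy game both policies concentrate on action 0 and the Nash gap shrinks toward zero. The sketch illustrates the form of the update only; it says nothing about the √N rate, which concerns the general multi-state, N-agent setting.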
Taxonomy
Topics: Game Theory and Applications · Economic Policies and Impacts
