Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning
Zida Wu, Mathieu Lauriere, Samuel Jia Cong Chua, Matthieu Geist, Olivier Pietquin, Ankur Mehta

TL;DR
This paper introduces a novel deep reinforcement learning algorithm for mean-field games that efficiently learns population-dependent Nash equilibria without averaging or sampling, demonstrating superior convergence in multiple examples.
Contribution
The paper presents a population-aware online mirror descent algorithm for MFGs that improves convergence and stability over existing methods, using an inner-loop replay buffer.
Findings
Achieves population-dependent Nash equilibrium without averaging or sampling.
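Demonstrates better convergence than state-of-the-art algorithms on four canonical examples, in particular a DRL version of Fictitious Play for population-dependent policies.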
Applicable to various initial distributions.
Abstract
Mean Field Games (MFGs) have the ability to handle large-scale multi-agent systems, but learning Nash equilibria in MFGs remains a challenging task. In this paper, we propose a deep reinforcement learning (DRL) algorithm that achieves population-dependent Nash equilibrium without the need for averaging or sampling from history, inspired by Munchausen RL and Online Mirror Descent. Through the design of an additional inner-loop replay buffer, the agents can effectively learn to achieve Nash equilibrium from any distribution, mitigating catastrophic forgetting. The resulting policy can be applied to various initial distributions. Numerical experiments on four canonical examples demonstrate our algorithm has better convergence properties than SOTA algorithms, in particular a DRL version of Fictitious Play for population-dependent policies.
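The Online Mirror Descent idea mentioned in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' deep RL implementation: it assumes a small discrete state-action space and uses random numbers as a stand-in for the Q-values that would normally be evaluated against the current mean field. It also leaves out the population-dependent input and the replay buffer. The names `omd_step`, `tau`, and `q_sum` are hypothetical. The sketch shows only that repeated multiplicative updates of the form pi_next(a|s) ∝ pi(a|s) · exp(Q(s,a)/tau), started from a uniform policy, give the same result as a softmax over the summed Q-values, so no averaging of past policies is needed.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def omd_step(policy, q_values, tau=1.0):
    """One mirror descent step with a KL regularizer (tabular sketch).

    pi_next(a|s) is proportional to pi(a|s) * exp(Q(s,a) / tau).
    """
    logits = np.log(policy) + q_values / tau
    return softmax(logits, axis=-1)


n_states, n_actions, tau = 3, 2, 1.0
policy = np.full((n_states, n_actions), 1.0 / n_actions)  # uniform start
q_sum = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

for _ in range(5):
    # Assumption: random values stand in for Q evaluated against
    # the mean field induced by the current policy.
    q = rng.normal(size=(n_states, n_actions))
    q_sum += q
    policy = omd_step(policy, q, tau=tau)

# Starting from a uniform policy, the iterated update equals a softmax
# over the cumulative Q-values.
closed_form = softmax(q_sum / tau, axis=-1)
```

In a deep RL setting the summed Q-function is not stored explicitly. A Munchausen-style log-policy term in the learning target is one way to let a single network track this sum implicitly, which is the connection the abstract draws between the two ideas.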
Taxonomy
Topics: Reinforcement Learning in Robotics
