Partially Observable Mean Field Reinforcement Learning
Sriram Ganapathi Subramanian, Matthew E. Taylor, Mark Crowley, Pascal Poupart

TL;DR
This paper introduces a novel mean field reinforcement learning approach that models uncertainty in the mean field, enabling scalable multi-agent learning in environments with many agents and limited visibility.
Contribution
It relaxes the assumption of exact mean field knowledge by maintaining a distribution, and proposes Q-learning algorithms for different visibility settings, with theoretical and empirical validation.
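As an illustration of what "maintaining a distribution" over the mean field could look like, here is a minimal Python sketch. It assumes discrete actions and a Dirichlet distribution over the mean action, updated from the actions of whichever neighbours are currently visible. The choice of a Dirichlet, the prior, and the function names (`update_counts`, `sample_mean_field`, `posterior_mean`) are assumptions made for this example; the summary above does not specify them.

```python
import random

def update_counts(prior_counts, observed_actions):
    """Dirichlet-categorical update: add one count per observed neighbour action.

    Assumption: actions are integer indices into prior_counts.
    """
    counts = list(prior_counts)
    for a in observed_actions:
        counts[a] += 1.0
    return counts

def sample_mean_field(counts, rng):
    """Draw one plausible mean action (a point on the simplex) from Dirichlet(counts).

    Uses the standard construction: normalise independent Gamma(c, 1) draws.
    """
    draws = [rng.gammavariate(c, 1.0) for c in counts]
    total = sum(draws)
    return [d / total for d in draws]

def posterior_mean(counts):
    """Expected mean action under Dirichlet(counts)."""
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical example: 3 actions, uniform prior, 5 visible neighbours.
rng = random.Random(0)
prior = [1.0, 1.0, 1.0]
visible_actions = [0, 2, 2, 1, 2]

counts = update_counts(prior, visible_actions)
sample = sample_mean_field(counts, rng)   # one sampled estimate of the mean field
expected = posterior_mean(counts)         # point estimate of the mean field
```

An agent could then condition its Q-function on `sample` or `expected` in place of the exact mean action that fully observable mean field methods assume. With few visible neighbours the prior dominates and the samples stay spread out, which is one way to express uncertainty about the mean field.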
Findings
The proposed algorithms outperform baselines in multiple games
The Q-learning estimate remains close to the Nash Q-value
The approach is effective in large multi-agent environments
Abstract
Traditional multi-agent reinforcement learning algorithms are not scalable to environments with more than a few agents, since these algorithms are exponential in the number of agents. Recent research has introduced successful methods to scale multi-agent reinforcement learning algorithms to many agent scenarios using mean field theory. Previous work in this field assumes that an agent has access to exact cumulative metrics regarding the mean field behaviour of the system, which it can then use to take its actions. In this paper, we relax this assumption and maintain a distribution to model the uncertainty regarding the mean field of the system. We consider two different settings for this problem. In the first setting, only agents in a fixed neighbourhood are visible, while in the second setting, the visibility of agents is determined at random based on distances. For each of these…
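The two visibility settings named in the abstract can be sketched roughly as follows. This is a sketch under stated assumptions, not the paper's definition: positions are 2D points, the fixed neighbourhood is a Euclidean radius, and in the random setting an agent at distance d is visible with probability exp(-rate * d). The exponential form, the `rate` parameter, and the function names are hypothetical choices made for this example.

```python
import math
import random

def visible_fixed(agent, others, radius):
    """Setting 1 (assumed form): every agent within a fixed radius is visible."""
    return [o for o in others if math.dist(agent, o) <= radius]

def visible_random(agent, others, rate, rng):
    """Setting 2 (assumed form): visibility is random and decays with distance.

    An agent at distance d is visible with probability exp(-rate * d).
    """
    return [
        o for o in others
        if rng.random() < math.exp(-rate * math.dist(agent, o))
    ]

# Hypothetical positions for illustration only.
agent = (0.0, 0.0)
others = [(1.0, 0.0), (3.0, 4.0), (0.0, 2.0)]

near = visible_fixed(agent, others, radius=2.0)
seen = visible_random(agent, others, rate=0.5, rng=random.Random(0))
```

In both cases the agent observes only a subset of the population, so any mean field estimate built from `near` or `seen` is uncertain rather than exact.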
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Game Theory and Applications
