On the Approximation of Cooperative Heterogeneous Multi-Agent Reinforcement Learning (MARL) using Mean Field Control (MFC)
Washim Uddin Mondal, Mridul Agarwal, Vaneet Aggarwal, and Satish V. Ukkusuri

TL;DR
This paper establishes approximation guarantees for heterogeneous multi-agent reinforcement learning problems using mean field control, providing error bounds and a convergent policy gradient algorithm.
Contribution
It introduces theoretical bounds for approximating heterogeneous MARL with MFC and proposes a Natural Policy Gradient algorithm with convergence guarantees.
Findings
Derived error bounds for three distribution scenarios.
Proposed a Natural Policy Gradient (NPG) algorithm with convergence guarantees.
Quantified sample complexity for policy convergence.
Abstract
Mean field control (MFC) is an effective way to mitigate the curse of dimensionality of cooperative multi-agent reinforcement learning (MARL) problems. This work considers a collection of heterogeneous agents that can be segregated into K classes such that the k-th class contains N_k homogeneous agents. We aim to prove approximation guarantees of the MARL problem for this heterogeneous system by its corresponding MFC problem. We consider three scenarios where the reward and transition dynamics of all agents are respectively taken to be functions of (1) joint state and action distributions across all classes, (2) individual distributions of each class, and (3) marginal distributions of the entire population. We show that, in these cases, the K-class MARL problem can be approximated by MFC with errors given as…
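To make the three scenarios in the abstract more concrete, the following is a minimal sketch of the empirical distributions that each scenario's reward and transition functions might take as input. It assumes finite state and action spaces encoded as integers; the function name `empirical_distributions` and all parameter names are hypothetical and do not come from the paper, whose precise definitions may differ.

```python
import numpy as np

def empirical_distributions(states, actions, classes, n_states, n_actions, n_classes):
    """Illustrative (hypothetical) helper: empirical distributions of a population.

    states, actions, classes: integer sequences of equal length, one entry per agent.
    Returns (joint, per_class, state_marginal, action_marginal).
    """
    states = np.asarray(states, dtype=int)
    actions = np.asarray(actions, dtype=int)
    classes = np.asarray(classes, dtype=int)
    n_pop = len(states)

    # Scenario 1 (assumed reading): joint distribution over (class, state, action),
    # normalized by the total population size.
    joint = np.zeros((n_classes, n_states, n_actions))
    np.add.at(joint, (classes, states, actions), 1.0)
    joint /= n_pop

    # Scenario 2 (assumed reading): one state-action distribution per class,
    # each normalized by that class's own share of the population.
    class_mass = joint.sum(axis=(1, 2), keepdims=True)
    per_class = np.divide(joint, class_mass,
                          out=np.zeros_like(joint), where=class_mass > 0)

    # Scenario 3 (assumed reading): state and action marginals of the whole
    # population, with class identity summed out.
    state_marginal = joint.sum(axis=(0, 2))
    action_marginal = joint.sum(axis=(0, 1))

    return joint, per_class, state_marginal, action_marginal
```

In the mean field limit these empirical quantities are replaced by deterministic distributions, which is what allows the MFC problem to avoid tracking every agent individually.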
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Game Theory and Applications
