Mean-Field Approximation of Cooperative Constrained Multi-Agent Reinforcement Learning (CMARL)

Washim Uddin Mondal; Vaneet Aggarwal; Satish V. Ukkusuri

arXiv:2209.07437 · cs.LG · September 11, 2024

TL;DR

This paper extends mean-field control methods to constrained multi-agent reinforcement learning, providing theoretical error bounds and a natural policy gradient algorithm with sample complexity guarantees.

Contribution

It introduces a mean-field approximation framework for constrained MARL and develops a natural policy gradient algorithm with proven error bounds and sample complexity.

Findings

01

Error bound of order $\frac{\sqrt{|X|} + \sqrt{|U|}}{\sqrt{N}}$ for constrained MARL approximation.

02

Improved error bound of order $\frac{\sqrt{|X|}}{\sqrt{N}}$ in special cases.

03

Proposed algorithm achieves an error of $O(e)$ with sample complexity $O(e^{-6})$.
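
To make the three orders above concrete, the following is a minimal arithmetic sketch, not code from the paper. It drops every constant hidden by the $O(\cdot)$ notation, so the numbers only indicate how the bounds scale with $|X|$, $|U|$, $N$, and $e$. The function names (`mfc_error_scale`, `agents_for_error`, `sample_scale`) and the flag `action_dist_dependent` are hypothetical, introduced here for illustration.

```python
import math


def mfc_error_scale(num_states, num_actions, num_agents, action_dist_dependent=True):
    """Order of the MFC approximation error, with all constants dropped.

    General case:  (sqrt|X| + sqrt|U|) / sqrt(N)
    Special case (reward, cost, and transitions independent of the
    population's action distribution):  sqrt|X| / sqrt(N)
    """
    numerator = math.sqrt(num_states)
    if action_dist_dependent:
        numerator += math.sqrt(num_actions)
    return numerator / math.sqrt(num_agents)


def agents_for_error(num_states, num_actions, target, action_dist_dependent=True):
    """Smallest N for which the constant-free scale is at most `target`."""
    numerator = math.sqrt(num_states)
    if action_dist_dependent:
        numerator += math.sqrt(num_actions)
    return math.ceil((numerator / target) ** 2)


def sample_scale(error):
    """Order of the sample complexity, e^{-6}, with constants dropped."""
    return error ** -6


if __name__ == "__main__":
    # Illustrative sizes only: |X| = 16 states, |U| = 9 actions, N = 100 agents.
    print(mfc_error_scale(16, 9, 100))                               # general case
    print(mfc_error_scale(16, 9, 100, action_dist_dependent=False))  # special case
    print(agents_for_error(16, 9, target=0.5))
    print(sample_scale(0.5))
```

Because the error decays as $1/\sqrt{N}$, halving the target error in this sketch requires four times as many agents, while the $e^{-6}$ order means halving the target error multiplies the sample count by 64.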

Abstract

Mean-Field Control (MFC) has recently been proven to be a scalable tool for approximately solving large-scale multi-agent reinforcement learning (MARL) problems. However, these studies are typically limited to the unconstrained cumulative reward maximization framework. In this paper, we show that one can use the MFC approach to approximate the MARL problem even in the presence of constraints. Specifically, we prove that an $N$-agent constrained MARL problem, with the state and action spaces of each individual agent being of sizes $|X|$ and $|U|$ respectively, can be approximated by an associated constrained MFC problem with an error $e \triangleq O\left(\left[\sqrt{|X|} + \sqrt{|U|}\right] / \sqrt{N}\right)$. In a special case where the reward, cost, and state transition functions are independent of the action distribution of the population, we prove that the…

Taxonomy

Topics: Reinforcement Learning in Robotics