Distributed primal-dual algorithm for constrained multi-agent reinforcement learning under coupled policies
Pengcheng Dai, He Wang, Dongming Wang, Wenwu Yu

TL;DR
This paper introduces a distributed primal-dual algorithm for constrained multi-agent reinforcement learning with coupled policies, ensuring agents optimize objectives while respecting safety constraints through local estimates and limited communication.
Contribution
It proposes a distributed primal-dual framework for CMARL with coupled policies, in which agents rely on local estimates and neighborhood communication instead of sharing their true policy parameters or Lagrange multipliers, which improves scalability.
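The generic primal-dual idea behind such a framework can be illustrated on a toy single-variable problem. The sketch below is not the paper's algorithm: the objective `f`, the constraint `g`, the step size `eta`, and the function name `primal_dual` are all assumptions chosen for illustration. It runs gradient ascent on the primal variable and projected gradient descent on a nonnegative Lagrange multiplier.

```python
def primal_dual(steps=5000, eta=0.05):
    """Toy primal-dual iteration (illustrative assumption, not the paper's method).

    Problem: maximize f(x) = -(x - 2)^2  subject to  g(x) = 1 - x >= 0.
    Lagrangian: L(x, lam) = f(x) + lam * g(x), with lam >= 0.
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        # Gradient of the Lagrangian with respect to x: f'(x) + lam * g'(x).
        grad_x = -2.0 * (x - 2.0) - lam
        # Constraint value, which is the gradient of the Lagrangian w.r.t. lam.
        g = 1.0 - x
        # Primal ascent step.
        x = x + eta * grad_x
        # Dual descent step, projected onto lam >= 0.
        lam = max(0.0, lam - eta * g)
    return x, lam


x_final, lam_final = primal_dual()
print(x_final, lam_final)
```

For this toy problem the constrained optimum is x = 1 with multiplier 2, so the iterates should settle near those values. In a multi-agent setting each agent would run updates of this flavor on its own parameters and multiplier, using only neighborhood information.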
Findings
Achieves $\epsilon$-first-order stationarity with high probability.
Provides convergence bounds with approximation error depending on coupling and truncation distances.
Demonstrates effectiveness through simulations in a GridWorld environment.
Abstract
In this work, we investigate constrained multi-agent reinforcement learning (CMARL), where agents collaboratively maximize the sum of their local objectives while satisfying individual safety constraints. We propose a framework where agents adopt coupled policies that depend on their own local states and parameters as well as those of their neighbors within a fixed number of hops, referred to as the coupling distance. A distributed primal-dual algorithm is further developed under this framework, wherein each agent has access only to state-action pairs within a bounded-hop neighborhood and to reward information within a bounded-hop neighborhood whose radius depends on the truncation distance. Moreover, agents are not permitted to directly share their true policy parameters or Lagrange multipliers. Instead, each agent constructs and maintains local estimates of these…
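The notion of a hop-limited neighborhood used in the abstract can be made concrete with a breadth-first search over the communication graph. This is a minimal sketch under stated assumptions: the adjacency-dictionary representation, the function name `k_hop_neighborhood`, the convention that an agent belongs to its own neighborhood, and the example line graph are all hypothetical and not taken from the paper.

```python
from collections import deque


def k_hop_neighborhood(adjacency, agent, k):
    """Return the set of agents within k hops of `agent`.

    Assumption: the agent itself counts as part of its own neighborhood.
    `adjacency` maps each agent to a list of its direct neighbors.
    """
    visited = {agent}
    frontier = deque([(agent, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            # Do not expand beyond the hop limit.
            continue
        for neighbor in adjacency[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return visited


# Hypothetical example: five agents on a line, 0 - 1 - 2 - 3 - 4.
line_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(k_hop_neighborhood(line_graph, 2, 1))
```

Here agent 2 with a hop limit of 1 sees agents 1, 2, and 3. Larger hop limits widen the set of agents whose states, actions, or rewards an agent may use, which is the trade-off the coupling and truncation distances control.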
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Smart Grid Security and Resilience
