Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities

Donghao Ying; Yunkai Zhang; Yuhao Ding; Alec Koppel; Javad Lavaei

arXiv:2305.17568 · cs.LG · May 30, 2023 · 1 citation

TL;DR

This paper introduces a scalable primal-dual actor-critic algorithm for safe multi-agent reinforcement learning with general utilities, addressing challenges of large state-action spaces and agent safety constraints.

Contribution

The paper proposes a primal-dual method with κ-hop neighbor truncation for safe multi-agent RL under general utilities, with proven convergence and sample complexity guarantees.
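To make the neighbor-truncation idea concrete, the sketch below computes the set of agents within κ hops of a given agent on a communication graph. It is only an illustration of what a κ-hop neighborhood is, not the paper's algorithm: the function name `k_hop_neighborhood`, the adjacency-dict representation, and the 6-agent line graph are all assumptions made for this example.

```python
from collections import deque


def k_hop_neighborhood(adjacency, agent, kappa):
    """Return the agents within `kappa` hops of `agent`, including `agent` itself.

    `adjacency` maps each agent to a list of its direct neighbors
    (a hypothetical representation chosen for this sketch).
    """
    seen = {agent}
    frontier = deque([(agent, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == kappa:
            continue  # do not expand beyond the communication radius
        for nbr in adjacency[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return seen


# Hypothetical line graph of six agents: 0-1-2-3-4-5
line_graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
print(sorted(k_hop_neighborhood(line_graph, 2, 1)))  # [1, 2, 3]
```

On this line graph the neighborhood size grows linearly in κ, independent of the total number of agents. That locality is the intuition behind restricting each agent's computation to a bounded neighborhood.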

Findings

01. In the exact setting, the algorithm converges to a first-order stationary point at rate O(T^{-2/3}).

02. Sample complexity is approximately O(ε^{-3.5}) to reach an ε-approximate first-order stationary point.

03. Numerical experiments demonstrate the method's effectiveness.
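As a back-of-the-envelope reading of the first finding, the stated rate can be inverted to get an iteration count for a target accuracy. This assumes the rate bounds some stationarity measure directly; the symbols C (a constant) and E(T) (the stationarity measure after T iterations) are introduced here for illustration and are not taken from the paper.

```latex
\mathcal{E}(T) \le C\,T^{-2/3},
\qquad
C\,T^{-2/3} \le \epsilon
\;\Longleftrightarrow\;
T \ge \left(\frac{C}{\epsilon}\right)^{3/2},
\qquad\text{so } T = O\!\left(\epsilon^{-3/2}\right) \text{ iterations suffice.}
```

This counts iterations in the exact setting and is distinct from the sample complexity in the second finding, which also accounts for estimation from samples.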

Abstract

We investigate safe multi-agent reinforcement learning, where agents seek to collectively maximize an aggregate sum of local objectives while satisfying their own safety constraints. The objective and constraints are described by general utilities, i.e., nonlinear functions of the long-term state-action occupancy measure, which encompass broader decision-making goals such as risk, exploration, or imitations. The exponential growth of the state-action space size with the number of agents presents challenges for global observability, further exacerbated by the global coupling arising from agents' safety constraints. To tackle this issue, we propose a primal-dual method utilizing shadow reward and κ-hop neighbor truncation under a form of correlation decay property, where κ is the communication radius. In the exact setting, our algorithm converges to a first-order…
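The notions of occupancy measure, general utility, and shadow reward can be illustrated on a tiny single-agent tabular MDP. The sketch below is not the paper's method. It assumes an unnormalized discounted occupancy measure, uses an entropy utility as one example of a nonlinear objective (an exploration-style goal), and takes the shadow reward to be the gradient of the utility with respect to the occupancy measure. All function names, the random MDP, and the uniform policy are hypothetical.

```python
import numpy as np


def occupancy_measure(P, pi, mu0, gamma):
    """Discounted occupancy: lam[s, a] = sum_t gamma^t * Pr(s_t = s, a_t = a).

    P has shape (S, A, S), pi has shape (S, A), mu0 has shape (S,).
    """
    n_states = P.shape[0]
    # State-to-state transition kernel induced by the policy.
    P_pi = np.einsum("sa,sat->st", pi, P)
    # Discounted state visitation solves d = mu0 + gamma * P_pi^T d.
    d = np.linalg.solve(np.eye(n_states) - gamma * P_pi.T, mu0)
    return d[:, None] * pi


def entropy_utility(lam, gamma):
    """Example general utility: entropy of the normalized occupancy measure."""
    lam_bar = (1.0 - gamma) * lam
    return -np.sum(lam_bar * np.log(lam_bar))


def entropy_shadow_reward(lam, gamma):
    """Gradient of entropy_utility with respect to lam (assumed shadow reward)."""
    lam_bar = (1.0 - gamma) * lam
    return -(1.0 - gamma) * (np.log(lam_bar) + 1.0)


# Hypothetical MDP: 3 states, 2 actions, strictly positive random transitions.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(3), size=(3, 2))  # shape (3, 2, 3)
pi = np.full((3, 2), 0.5)                   # uniform policy
mu0 = np.full(3, 1.0 / 3.0)                 # uniform initial distribution
gamma = 0.9

lam = occupancy_measure(P, pi, mu0, gamma)
shadow_r = entropy_shadow_reward(lam, gamma)
print(lam.sum())       # total mass equals 1 / (1 - gamma)
print(shadow_r.shape)  # one shadow reward per state-action pair
```

For a linear utility f(λ) = ⟨r, λ⟩ the gradient is simply r, so the shadow reward reduces to the ordinary reward and the standard RL objective is recovered. For a nonlinear utility the shadow reward depends on the current occupancy measure and changes as the policy changes.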

Videos

Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Viral Infectious Diseases and Gene Expression in Insects · Gene Regulatory Network Analysis