Safe Multi-Agent Reinforcement Learning with Convergence to Generalized Nash Equilibrium

Zeyang Li; Navid Azizan

arXiv:2411.15036·cs.LG·November 25, 2024

Open Access

TL;DR

This paper introduces a novel safe multi-agent reinforcement learning framework that guarantees safety at every state, converges to a generalized Nash equilibrium, and outperforms existing methods in complex environments.

Contribution

It proposes a new theoretical framework using state-wise constraints and the controlled invariant set, along with a practical deep RL algorithm called MADAC for safe multi-agent learning.

Findings

01. MADAC outperforms existing safe MARL methods in benchmarks.

02. The framework guarantees convergence to a generalized Nash equilibrium.

03. The approach effectively balances safety and performance in high-dimensional systems.

Abstract

Multi-agent reinforcement learning (MARL) has achieved notable success in cooperative tasks, demonstrating impressive performance and scalability. However, deploying MARL agents in real-world applications presents critical safety challenges. Current safe MARL algorithms are largely based on the constrained Markov decision process (CMDP) framework, which enforces constraints only on discounted cumulative costs and lacks an all-time safety assurance. Moreover, these methods often overlook the feasibility issue (the system will inevitably violate state constraints within certain regions of the constraint set), resulting in either suboptimal performance or increased constraint violations. To address these challenges, we propose a novel theoretical framework for safe MARL with state-wise constraints, where safety requirements are enforced at every state the agents visit. To…
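The contrast the abstract draws between a CMDP-style constraint and a state-wise constraint can be illustrated with a small sketch. This is not the paper's formulation or the MADAC algorithm. The function names, the constraint signal `h` (assumed safe when `h <= 0`), the per-step cost `max(h, 0)`, and the cost budget are all hypothetical choices made only for illustration.

```python
from typing import Sequence


def discounted_cost(costs: Sequence[float], gamma: float) -> float:
    """Discounted cumulative cost: sum over t of gamma**t * c_t."""
    return sum((gamma ** t) * c for t, c in enumerate(costs))


def satisfies_cmdp(costs: Sequence[float], gamma: float, budget: float) -> bool:
    """CMDP-style check (assumed form): discounted cumulative cost within a budget."""
    return discounted_cost(costs, gamma) <= budget


def satisfies_state_wise(h_values: Sequence[float]) -> bool:
    """State-wise check (assumed form): h(s_t) <= 0 at every visited state."""
    return all(h <= 0.0 for h in h_values)


# Hypothetical trajectory: constraint values at four visited states.
# The third state violates the constraint (h > 0).
h_values = [-1.0, -0.5, 0.2, -1.0]

# Assumed per-step cost: the amount of violation, zero when the state is safe.
costs = [max(h, 0.0) for h in h_values]

cmdp_ok = satisfies_cmdp(costs, gamma=0.9, budget=0.5)
state_wise_ok = satisfies_state_wise(h_values)

print("CMDP constraint satisfied:", cmdp_ok)
print("State-wise constraint satisfied:", state_wise_ok)
```

In this toy trajectory the discounted cost stays under the assumed budget even though one visited state is unsafe, which is the gap the abstract attributes to CMDP-based methods: a cumulative bound does not rule out individual violations.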

Taxonomy

Topics: Supply Chain and Inventory Management · Adaptive Dynamic Programming Control · Auction Theory and Applications

Methods: Sparse Evolutionary Training