Safe Multi-Agent Reinforcement Learning with Convergence to Generalized Nash Equilibrium
Zeyang Li, Navid Azizan

TL;DR
This paper introduces a safe multi-agent reinforcement learning framework that enforces safety at every state the agents visit, converges to a generalized Nash equilibrium, and outperforms existing safe MARL methods in complex environments.
Contribution
The paper proposes a theoretical framework built on state-wise constraints and the controlled invariant set, together with a practical deep RL algorithm, MADAC, for safe multi-agent learning.
Findings
MADAC outperforms existing safe MARL methods in benchmarks.
The framework guarantees convergence to a generalized Nash equilibrium.
The approach effectively balances safety and performance in high-dimensional systems.
Abstract
Multi-agent reinforcement learning (MARL) has achieved notable success in cooperative tasks, demonstrating impressive performance and scalability. However, deploying MARL agents in real-world applications presents critical safety challenges. Current safe MARL algorithms are largely based on the constrained Markov decision process (CMDP) framework, which enforces constraints only on discounted cumulative costs and lacks an all-time safety assurance. Moreover, these methods often overlook the feasibility issue (the system will inevitably violate state constraints within certain regions of the constraint set), resulting in either suboptimal performance or increased constraint violations. To address these challenges, we propose a novel theoretical framework for safe MARL with state-wise constraints, where safety requirements are enforced at every state the agents visit. To…
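The contrast the abstract draws between a CMDP-style constraint (a bound on discounted cumulative cost) and a state-wise constraint (a condition checked at every visited state) can be illustrated with a small sketch. This is not MADAC or any part of the authors' method; the function names, the sign convention h(s) <= 0 for a safe state, and the toy trajectory are all assumptions made for illustration.

```python
def discounted_cost(costs, gamma):
    """Sum of gamma^t * c_t over a trajectory (the quantity a CMDP bounds)."""
    total = 0.0
    for t, c in enumerate(costs):
        total += (gamma ** t) * c
    return total


def satisfies_cmdp(costs, gamma, budget):
    """CMDP-style check: only the discounted cumulative cost is bounded."""
    return discounted_cost(costs, gamma) <= budget


def satisfies_statewise(constraint_values):
    """State-wise check: h(s_t) <= 0 must hold at every visited state.

    The sign convention (h <= 0 means safe) is an assumption of this sketch.
    """
    return all(h <= 0.0 for h in constraint_values)


# Hypothetical 5-step trajectory with a single brief violation at t = 2.
h_values = [-1.0, -0.5, 0.2, -0.3, -1.0]             # h(s_t) along the trajectory
costs = [1.0 if h > 0.0 else 0.0 for h in h_values]  # indicator cost per step

cmdp_ok = satisfies_cmdp(costs, gamma=0.9, budget=1.0)
statewise_ok = satisfies_statewise(h_values)
print(cmdp_ok, statewise_ok)
```

In this toy trajectory the single violation contributes a discounted cost of 0.9^2 = 0.81, which stays under the assumed budget of 1.0, so the cumulative check passes even though the state-wise check fails at t = 2.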
Taxonomy
Topics: Supply Chain and Inventory Management · Adaptive Dynamic Programming Control · Auction Theory and Applications
Methods: Sparse Evolutionary Training
