Interpreting Primal-Dual Algorithms for Constrained Multiagent Reinforcement Learning

Daniel Tabas; Ahmed S. Zamzam; Baosen Zhang

arXiv:2211.16069·eess.SY·April 28, 2023

TL;DR

This paper analyzes primal-dual algorithms in constrained multiagent reinforcement learning, revealing how penalty terms influence safety and value estimation, and proposes an improved algorithm with better safety guarantees and convergence.

Contribution

It reinterprets the penalty terms used in primal-dual methods as enforcing probabilistic (chance and conditional value at risk) constraints, and derives an improved value estimation procedure from the penalty's effect on the value function.

Findings

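01 Using the constraint function itself as the penalty, the standard practice, yields only a weak notion of safety.

02 Simple modifications to the penalty term enforce meaningful probabilistic (chance and conditional value at risk) constraints.

03 The proposed value estimate accelerates convergence to safe policies.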

Abstract

Constrained multiagent reinforcement learning (C-MARL) is gaining importance as MARL algorithms find new applications in real-world systems ranging from energy systems to drone swarms. Most C-MARL algorithms use a primal-dual approach to enforce constraints through a penalty function added to the reward. In this paper, we study the structural effects of this penalty term on the MARL problem. First, we show that the standard practice of using the constraint function as the penalty leads to a weak notion of safety. However, by making simple modifications to the penalty term, we can enforce meaningful probabilistic (chance and conditional value at risk) constraints. Second, we quantify the effect of the penalty term on the value function, uncovering an improved value estimation procedure. We use these insights to propose a constrained multiagent advantage actor critic (C-MAA2C) algorithm.…
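The penalty variants and risk measures named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's C-MAA2C implementation: every function name, the `kind` labels, and the `threshold`, `limit`, and `step_size` parameters are hypothetical, and pairing an indicator penalty with a chance constraint and a hinge penalty with conditional value at risk (CVaR) is one plausible reading of "simple modifications to the penalty term". The CVaR estimate uses the standard Rockafellar-Uryasev sample formula, which is an assumption about the formulation rather than something this page states.

```python
def penalty_value(cost, kind="linear", threshold=0.0):
    """Penalty applied to one constraint value (hypothetical variants)."""
    if kind == "linear":       # constraint function used directly as the penalty
        return cost
    if kind == "indicator":    # 1 whenever the constraint is violated
        return 1.0 if cost > threshold else 0.0
    if kind == "hinge":        # only the excess over the threshold is penalized
        return max(cost - threshold, 0.0)
    raise ValueError(f"unknown penalty kind: {kind}")


def penalized_reward(reward, cost, multiplier, kind="linear", threshold=0.0):
    """Primal side: reward minus the multiplier-weighted penalty."""
    return reward - multiplier * penalty_value(cost, kind, threshold)


def dual_ascent_step(multiplier, avg_penalty, limit, step_size):
    """Dual side: projected gradient ascent keeps the multiplier nonnegative."""
    return max(0.0, multiplier + step_size * (avg_penalty - limit))


def violation_probability(costs, threshold=0.0):
    """Empirical estimate of P(cost > threshold), the chance-constraint quantity."""
    return sum(1 for c in costs if c > threshold) / len(costs)


def empirical_cvar(costs, alpha):
    """Sample CVaR at level alpha: min over b of b + E[(c - b)+] / (1 - alpha).

    The objective is convex and piecewise linear in b with breakpoints at the
    samples, so checking each sample as a candidate b finds the minimum.
    """
    scale = 1.0 / ((1.0 - alpha) * len(costs))
    return min(b + scale * sum(max(c - b, 0.0) for c in costs) for b in costs)
```

For example, with sampled constraint values `[0.0, 1.0, 2.0, 3.0]`, `violation_probability(..., threshold=1.5)` counts the fraction of samples above 1.5, and `empirical_cvar(..., alpha=0.5)` averages the worst half of the samples.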

Code & Models

Repositories

dtabas/multiagent-particle-envs (Official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning