Robust Lagrangian and Adversarial Policy Gradient for Robust Constrained Markov Decision Processes

David M. Bossens

arXiv:2308.11267·cs.LG·May 16, 2024

TL;DR

This paper introduces two algorithms for robust constrained Markov decision processes (RCMDPs), RCPG with Robust Lagrangian and Adversarial RCPG, which improve robustness and support incremental learning. In empirical tests on inventory and navigation tasks, both outperform traditional RCPG variants.

Contribution

The paper proposes two new algorithms that enhance robustness and incremental learning in RCMDPs by reformulating the worst-case dynamics based on the Lagrangian and learning adversarial policies incrementally.

Findings

01. Both algorithms outperform traditional RCPG variants.

02. Adversarial RCPG ranks among the top two in all tests.

03. Algorithms show robustness in inventory and navigation tasks.

Abstract

The robust constrained Markov decision process (RCMDP) is a recent task-modelling framework for reinforcement learning that incorporates behavioural constraints and that provides robustness to errors in the transition dynamics model through the use of an uncertainty set. Simulating RCMDPs requires computing the worst-case dynamics based on value estimates for each state, an approach which has previously been used in the Robust Constrained Policy Gradient (RCPG). Highlighting potential downsides of RCPG such as not robustifying the full constrained objective and the lack of incremental learning, this paper introduces two algorithms, called RCPG with Robust Lagrangian and Adversarial RCPG. RCPG with Robust Lagrangian modifies RCPG by taking the worst-case dynamics based on the Lagrangian rather than either the value or the constraint. Adversarial RCPG also formulates the worst-case…
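
The difference between taking the worst case on the value and taking it on the Lagrangian can be illustrated with a minimal Python sketch. It is not the paper's implementation: the L1 uncertainty set of radius `alpha`, the sign convention `value - lam * cost` for the Lagrangian, the function name `worst_case_l1`, and all numbers below are assumptions chosen purely for illustration.

```python
import numpy as np

def worst_case_l1(p_nominal, scores, alpha):
    """Minimise p @ scores over distributions p with ||p - p_nominal||_1 <= alpha.

    Hypothetical helper (not from the paper's code): the usual greedy
    solution moves up to alpha/2 probability mass onto the lowest-scoring
    next state and takes it from the highest-scoring ones.
    """
    p = np.asarray(p_nominal, dtype=float).copy()
    scores = np.asarray(scores, dtype=float)
    worst = int(np.argmin(scores))
    # Mass that may be added to the worst next state.
    budget = min(alpha / 2.0, 1.0 - p[worst])
    p[worst] += budget
    # Remove the same amount of mass, best-scoring states first.
    for i in np.argsort(-scores):
        if budget <= 1e-12:
            break
        if i == worst:
            continue
        take = min(p[i], budget)
        p[i] -= take
        budget -= take
    return p

# Toy numbers (assumed): three next states for one state-action pair.
p_nominal = np.array([0.5, 0.3, 0.2])
v_reward = np.array([1.0, 2.0, 3.0])   # reward value estimates
v_cost = np.array([0.0, 0.0, 4.0])     # constraint-cost value estimates
lam = 1.0                              # Lagrange multiplier
alpha = 0.2                            # L1 radius of the uncertainty set

# Worst case computed on the reward value only.
p_value = worst_case_l1(p_nominal, v_reward, alpha)

# Worst case computed on the Lagrangian (assumed form: value - lam * cost).
lagrangian = v_reward - lam * v_cost
p_lagr = worst_case_l1(p_nominal, lagrangian, alpha)

print("value-based worst case:     ", p_value)
print("Lagrangian-based worst case:", p_lagr)
```

In this toy example the value-based adversary shifts probability toward the low-reward state 0, whereas the Lagrangian-based adversary shifts it toward state 2, whose high constraint cost makes it the worst outcome for the constrained objective as a whole.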

Code & Models

Repositories

bossdm/RCMDP
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Fault Detection and Control Systems