Robust Constrained-MDPs: Soft-Constrained Robust Policy Optimization under Model Uncertainty

Reazul Hasan Russel; Mouhacine Benosman; Jeroen Van Baar

arXiv:2010.04870 · cs.LG · October 13, 2020 · 6 cites

TL;DR

This paper introduces robust constrained Markov decision processes (RCMDPs), which equip reinforcement learning algorithms with performance and safety guarantees under model uncertainty. The formulation is especially relevant to real-world applications such as Sim2Real transfer.

Contribution

The paper merges CMDP and RMDP theories to formulate RCMDPs, enabling the design of robust RL algorithms with constraint satisfaction guarantees under model uncertainty.
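As a rough illustration of what merging the two theories means, the problem can be written as a max-min objective with a worst-case constraint. The notation here is an assumption made for this sketch and may not match the paper's: \mathcal{P} is the uncertainty set of transition models, r the reward, d a constraint cost, \beta a budget, and \lambda a Lagrange multiplier. The sign convention of the constraint in particular may differ in the paper.

```latex
% Robust constrained objective (assumed notation)
\max_{\pi \in \Pi} \; \min_{p \in \mathcal{P}} \;
  \mathbb{E}^{\pi,p}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right]
\quad \text{s.t.} \quad
\max_{p \in \mathcal{P}} \;
  \mathbb{E}^{\pi,p}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, d(s_t, a_t) \right] \le \beta

% Lagrangian relaxation with multiplier \lambda \ge 0
L(\pi, \lambda) =
  \min_{p \in \mathcal{P}} \mathbb{E}^{\pi,p}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right]
  - \lambda \left( \max_{p \in \mathcal{P}} \mathbb{E}^{\pi,p}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, d(s_t, a_t) \right] - \beta \right)
```

Setting \mathcal{P} to a single model gives back an ordinary CMDP, and dropping the constraint gives back an ordinary RMDP.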

Findings

01. Proposed a Lagrangian-based robust policy gradient algorithm.

02. Validated the approach on an inventory management problem.

03. Demonstrated robustness and safety guarantees in uncertain environments.
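To make the first finding more concrete, here is a minimal sketch of a Lagrangian primal-dual loop with worst-case evaluation. It is not the paper's algorithm. Everything in it is an assumption made for illustration: the uncertainty set is a finite list of two hand-written transition models, the toy MDP has two states and two actions, the policy is a tabular softmax, and a finite-difference gradient stands in for a policy gradient estimator. All function names are hypothetical.

```python
import numpy as np

def softmax_policy(theta):
    """Tabular softmax policy: one row of action probabilities per state."""
    z = theta - theta.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def discounted_value(P, signal, pi, p0, gamma):
    """Exact expected discounted sum of `signal` under model P and policy pi."""
    P_pi = np.einsum("sa,sat->st", pi, P)   # state-to-state matrix under pi
    sig_pi = (pi * signal).sum(axis=1)      # expected per-step signal per state
    v = np.linalg.solve(np.eye(len(p0)) - gamma * P_pi, sig_pi)
    return float(p0 @ v)

def worst_case(models, reward, cost, pi, p0, gamma):
    """Lowest return and highest constraint cost over the finite model set."""
    returns = [discounted_value(P, reward, pi, p0, gamma) for P in models]
    costs = [discounted_value(P, cost, pi, p0, gamma) for P in models]
    return min(returns), max(costs)

def lagrangian(theta, lam, models, reward, cost, p0, gamma, budget):
    ret, c = worst_case(models, reward, cost, softmax_policy(theta), p0, gamma)
    return ret - lam * (c - budget)

def numerical_grad(f, theta, eps=1e-5):
    """Central finite differences (assumed stand-in for a policy gradient)."""
    grad = np.zeros_like(theta)
    for idx in np.ndindex(theta.shape):
        step = np.zeros_like(theta)
        step[idx] = eps
        grad[idx] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad

def primal_dual(models, reward, cost, p0, gamma, budget,
                iters=200, lr_theta=0.5, lr_lam=0.05):
    theta = np.zeros(reward.shape)
    lam = 0.0
    for _ in range(iters):
        def f(th):
            return lagrangian(th, lam, models, reward, cost, p0, gamma, budget)
        theta = theta + lr_theta * numerical_grad(f, theta)  # ascent on the policy
        _, c = worst_case(models, reward, cost, softmax_policy(theta), p0, gamma)
        lam = max(0.0, lam + lr_lam * (c - budget))          # projected dual step
    return theta, lam

# Toy problem (all numbers are made up for illustration).
nominal = np.array([[[0.9, 0.1], [0.2, 0.8]],
                    [[0.7, 0.3], [0.1, 0.9]]])
perturbed = np.array([[[0.8, 0.2], [0.3, 0.7]],
                      [[0.6, 0.4], [0.2, 0.8]]])
models = [nominal, perturbed]
reward = np.array([[1.0, 2.0], [0.0, 3.0]])
cost = np.array([[0.0, 1.0], [0.0, 1.0]])   # action 1 incurs constraint cost
p0 = np.array([1.0, 0.0])
gamma, budget = 0.9, 5.0

theta, lam = primal_dual(models, reward, cost, p0, gamma, budget)
pi = softmax_policy(theta)
robust_return, robust_cost = worst_case(models, reward, cost, pi, p0, gamma)
print("policy:", pi)
print("worst-case return:", robust_return)
print("worst-case cost:", robust_cost, "budget:", budget, "lambda:", lam)
```

Because the same model is picked for the whole trajectory, the finite set here is not a rectangular uncertainty set; that choice keeps the worst case exact by enumeration but would not scale to the continuous sets usually used in robust MDPs.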

Abstract

In this paper, we focus on the problem of robustifying reinforcement learning (RL) algorithms with respect to model uncertainties. In the framework of model-based RL, we propose to merge the theory of constrained Markov decision processes (CMDPs) with the theory of robust Markov decision processes (RMDPs), leading to a formulation of robust constrained-MDPs (RCMDPs). This formulation, simple in essence, allows us to design RL algorithms that are robust in performance and provide constraint satisfaction guarantees with respect to uncertainties in the system's state transition probabilities. RCMDPs are important for real-life applications of RL. For instance, such a formulation can play an important role in policy transfer from simulation to the real world (Sim2Real) in safety-critical applications, which would benefit from performance and safety guarantees which are robust…

Code & Models

Repositories

bossdm/RCMDP (PyTorch)

Taxonomy

Topics: Reinforcement Learning in Robotics · Risk and Portfolio Optimization · Auction Theory and Applications