Policy Gradients for Probabilistic Constrained Reinforcement Learning

Weiqin Chen; Dharmashankar Subramanian; Santiago Paternain

arXiv:2210.00596·cs.LG·April 20, 2023

Open Access

TL;DR

This paper derives explicit gradient expressions for probabilistic safety constraints in reinforcement learning, so that policy-based methods can learn policies that keep the system state inside a safe set with high probability. The approach is demonstrated empirically on a continuous navigation task.

Contribution

To the best of the authors' knowledge, the paper provides the first explicit gradient expressions for probabilistic safety constraints, which allows these constraints to be used in policy optimization algorithms.

Findings

01  Successfully derived gradient expressions for probabilistic safety constraints.

02  Empirically validated the approach in a continuous navigation task.

03  Showed that probabilistic safety can be effectively incorporated into RL policies.

Abstract

This paper considers the problem of learning safe policies in the context of reinforcement learning (RL). In particular, we consider the notion of probabilistic safety. That is, we aim to design policies that maintain the state of the system in a safe set with high probability. This notion differs from the cumulative constraints often considered in the literature. The challenge of working with probabilistic safety is the lack of expressions for its gradients, which policy optimization algorithms need because they rely on gradients of both the objective function and the constraints. To the best of our knowledge, this work is the first to provide such explicit gradient expressions for probabilistic constraints. It is worth noting that the gradient of this family of constraints can be applied to various policy-based algorithms. We demonstrate empirically that it is possible to handle probabilistic constraints…
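
To make the notion of probabilistic safety concrete, the sketch below estimates by Monte Carlo the probability that a trajectory stays in a safe set over a whole horizon, along with a likelihood-ratio (score-function) estimate of the gradient of that probability with respect to a policy parameter. Everything in it is an assumption made for illustration and does not come from the paper: the one-dimensional dynamics s_{t+1} = s_t + a_t, the Gaussian policy with mean -theta * s, the safe set |s| <= bound, and the names rollout and estimate_safety_and_gradient. The estimator is the standard score-function identity applied to the indicator of the safe event; it is not claimed to be the exact expression derived in the paper.

```python
import random


def rollout(theta, rng, horizon=10, sigma=0.5, bound=1.0):
    """Simulate one trajectory of the toy system (hypothetical example).

    Returns (safe, score), where safe is 1.0 if every visited state
    satisfied |s| <= bound, and score is the sum over time of
    d/dtheta log pi_theta(a_t | s_t) along the trajectory.
    """
    s = 0.0
    score = 0.0
    for _ in range(horizon):
        mean = -theta * s               # Gaussian policy mean (assumed form)
        a = rng.gauss(mean, sigma)      # sample an action
        # d/dtheta log N(a; mean, sigma^2) = (a - mean) / sigma^2 * d(mean)/d(theta)
        score += (a - mean) / sigma ** 2 * (-s)
        s = s + a                       # assumed dynamics: s_{t+1} = s_t + a_t
        if abs(s) > bound:
            # Unsafe trajectories get weight 0, so the score no longer matters.
            return 0.0, score
    return 1.0, score


def estimate_safety_and_gradient(theta, n=20000, seed=0):
    """Monte Carlo estimates of P(safe) and of d/dtheta P(safe).

    Uses the identity
        d/dtheta E[1{safe}] = E[1{safe} * sum_t d/dtheta log pi_theta(a_t | s_t)],
    which holds here because the indicator does not depend on theta.
    """
    rng = random.Random(seed)
    p_sum = 0.0
    g_sum = 0.0
    for _ in range(n):
        safe, score = rollout(theta, rng)
        p_sum += safe
        g_sum += safe * score
    return p_sum / n, g_sum / n


if __name__ == "__main__":
    delta = 0.1  # hypothetical tolerated failure probability
    p_hat, grad_hat = estimate_safety_and_gradient(0.5)
    print("estimated safety probability:", p_hat)
    print("estimated gradient w.r.t. theta:", grad_hat)
    print("constraint P(safe) >= 1 - delta satisfied:", p_hat >= 1.0 - delta)
```

In a constrained policy-gradient loop, an estimate like grad_hat would be combined with the gradient of the return, for example through a Lagrangian with a multiplier on the constraint P(safe) >= 1 - delta. That combination is one common design choice and is not specified by the text above.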

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms