Probabilistic Constraint for Safety-Critical Reinforcement Learning

Weiqin Chen; Dharmashankar Subramanian; Santiago Paternain

arXiv:2306.17279·cs.LG·March 14, 2024

TL;DR

This paper introduces a gradient-based approach to safe reinforcement learning under probabilistic constraints, with a lower-variance gradient estimator and theoretical guarantees on safety and optimality.

Contribution

The paper develops an improved gradient estimator for probabilistic constraints and a safe primal-dual algorithm, supported by theoretical analysis and empirical validation.

Findings

01. The new gradient estimator reduces variance compared to previous methods.

02. The safe primal-dual algorithm converges and balances safety and optimality.

03. Empirical results confirm the effectiveness of the proposed methods.
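To make the primal-dual idea in the findings concrete, here is a minimal sketch of a generic primal-dual gradient update on a toy scalar problem. It is not the paper's algorithm: the functions `reward`, `safety_prob`, their closed-form gradients, the step size, and the tolerance `delta` are all assumptions chosen for illustration, whereas a real method would have to estimate these gradients from sampled trajectories.

```python
# Toy primal-dual sketch (illustrative assumptions throughout, not the paper's method).
# Problem: maximize reward(theta) subject to safety_prob(theta) >= 1 - delta.

def reward(theta):
    # Hypothetical reward surrogate: prefers theta = 2.
    return -(theta - 2.0) ** 2

def reward_grad(theta):
    return -2.0 * (theta - 2.0)

def safety_prob(theta):
    # Hypothetical safety surrogate: larger theta is less safe.
    # Behaves like a probability only for theta in [0, 2].
    return 1.0 - 0.5 * theta

def safety_grad(theta):
    return -0.5

delta = 0.1        # allowed probability of leaving the safe set (assumed)
step = 0.05        # shared step size for primal and dual updates (assumed)
theta, lam = 0.0, 0.0

for _ in range(5000):
    # Primal step: gradient ascent on the Lagrangian
    # L(theta, lam) = reward(theta) + lam * (safety_prob(theta) - (1 - delta)).
    theta += step * (reward_grad(theta) + lam * safety_grad(theta))
    # Dual step: projected gradient descent on lam, so lam grows
    # while the constraint is violated and shrinks otherwise.
    lam = max(0.0, lam - step * (safety_prob(theta) - (1.0 - delta)))

print(f"theta={theta:.4f}, lambda={lam:.4f}, safety={safety_prob(theta):.4f}")
```

In this toy setting the unconstrained optimum (theta = 2) is unsafe, so the multiplier rises until the iterate settles where the safety constraint is tight. That is the sense in which a primal-dual scheme trades reward against constraint satisfaction.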

Abstract

In this paper, we consider the problem of learning safe policies for probabilistic-constrained reinforcement learning (RL). Specifically, a safe policy or controller is one that, with high probability, maintains the trajectory of the agent in a given safe set. We establish a connection between this probabilistic-constrained setting and the cumulative-constrained formulation that is frequently explored in the existing literature. We provide theoretical bounds elucidating that the probabilistic-constrained setting offers a better trade-off in terms of optimality and safety (constraint satisfaction). The challenge encountered when dealing with the probabilistic constraints, as explored in this work, arises from the absence of explicit expressions for their gradients. Our prior work provides such an explicit gradient expression for probabilistic constraints which we term Safe Policy…
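As an illustration of the kind of constraint the abstract describes, the sketch below estimates by Monte Carlo the probability that a whole trajectory stays inside a safe set, and compares it with a threshold 1 - delta. Everything in it is an assumption made for illustration and does not come from the paper: the 1-D dynamics, the interval safe set, the two example policies, the noise level, and the value of `delta`.

```python
import random

def rollout_is_safe(policy, horizon, safe_low, safe_high, noise_std, rng):
    """Simulate one trajectory of a toy 1-D system.

    Returns True only if every visited state lies in [safe_low, safe_high].
    """
    state = 0.0
    for _ in range(horizon):
        action = policy(state)
        state = state + action + rng.gauss(0.0, noise_std)
        if not (safe_low <= state <= safe_high):
            return False  # a single excursion makes the whole trajectory unsafe
    return True

def estimate_safety_probability(policy, num_rollouts=2000, horizon=20,
                                safe_low=-1.0, safe_high=1.0,
                                noise_std=0.2, seed=0):
    """Fraction of sampled trajectories that never leave the safe set."""
    rng = random.Random(seed)
    safe_count = sum(
        rollout_is_safe(policy, horizon, safe_low, safe_high, noise_std, rng)
        for _ in range(num_rollouts)
    )
    return safe_count / num_rollouts

def stabilizing_policy(state):
    # Hypothetical controller that pushes the state back toward 0.
    return -0.8 * state

def idle_policy(state):
    # Hypothetical controller that never acts, leaving a pure random walk.
    return 0.0

delta = 0.1  # assumed tolerance: require P(trajectory stays safe) >= 1 - delta
p_stab = estimate_safety_probability(stabilizing_policy)
p_idle = estimate_safety_probability(idle_policy)
print(f"stabilizing: {p_stab:.3f}, satisfies constraint: {p_stab >= 1 - delta}")
print(f"idle:        {p_idle:.3f}, satisfies constraint: {p_idle >= 1 - delta}")
```

The event being counted is "all states are safe", a statement about the entire trajectory, rather than a sum of per-step costs. This is one way to read the distinction the abstract draws between the probabilistic and cumulative formulations.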

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Safety Systems Engineering in Autonomy