Safe Exploration in Reinforcement Learning: Training Backup Control Barrier Functions with Zero Training Time Safety Violations

Pedram Rabiee; Amirsaeid Safari

arXiv:2312.07828 · eess.SY · December 10, 2024 · 2 cites

Open Access

TL;DR

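This paper presents RLBUS, a reinforcement learning algorithm that guarantees zero safety violations during training. It uses backup control barrier functions to keep exploration safe and model-free RL to train an additional backup policy that enlarges the control forward-invariant subset of the safe set.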

Contribution

RLBUS combines backup control barrier functions with model-free RL to expand the region of the state space that can be explored safely.

Findings

01 Zero safety violations during training in the inverted pendulum example

02 Enlarged control forward-invariant subset of the safe set enables broader exploration

03 Improved performance without safety compromise

Abstract

This paper introduces the reinforcement learning backup shield (RLBUS), an algorithm that guarantees safe exploration in reinforcement learning (RL) by incorporating backup control barrier functions (BCBFs). RLBUS constructs an implicit control forward-invariant subset of the safe set using multiple backup policies, ensuring safety in the presence of input constraints. While traditional BCBFs often result in conservative control forward-invariant sets due to the design of backup controllers, RLBUS addresses this limitation by leveraging model-free RL to train an additional backup policy, which enlarges the identified control forward-invariant subset of the safe set. This approach enables the exploration of larger regions in the state space with zero safety violations during training. The effectiveness of RLBUS is demonstrated on an inverted pendulum example, where the expanded invariant…
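The idea of an implicit set defined by backup policies can be illustrated with a small sketch. The code below is not the RLBUS algorithm; it is a simplified switching shield written for this page under several assumptions: normalized pendulum dynamics theta'' = sin(theta) + u, hypothetical constants (`U_MAX`, `THETA_MAX`, `HORIZON`, `BACKUP_RADIUS`), two hand-picked linear backup policies (the second only stands in for a learned one), and a backup set that is assumed, not verified, to be forward invariant under each backup policy. A state counts as certified if at least one backup policy keeps the rollout safe and ends in the backup set.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
U_MAX = 1.5               # input constraint |u| <= U_MAX
THETA_MAX = math.pi / 3   # safe set: |theta| <= THETA_MAX
DT = 0.01                 # integration step in seconds
HORIZON = 600             # backup rollout length in steps (6 s)
BACKUP_RADIUS = 0.1       # backup set: ball around the upright equilibrium


def clip(u):
    """Saturate the input to the admissible range."""
    return max(-U_MAX, min(U_MAX, u))


def step(state, u):
    """One semi-implicit Euler step of theta'' = sin(theta) + u."""
    theta, omega = state
    omega_next = omega + DT * (math.sin(theta) + clip(u))
    return (theta + DT * omega_next, omega_next)


def is_safe(state):
    return abs(state[0]) <= THETA_MAX


def in_backup_set(state):
    return math.hypot(state[0], state[1]) <= BACKUP_RADIUS


def make_linear_policy(k_theta, k_omega):
    """Saturated linear state feedback used here as a backup policy."""
    return lambda state: clip(-k_theta * state[0] - k_omega * state[1])


def certified_by(state, policy):
    """True if the backup rollout stays safe and ends in the backup set."""
    for _ in range(HORIZON):
        if not is_safe(state):
            return False
        state = step(state, policy(state))
    return is_safe(state) and in_backup_set(state)


def certifying_policy(state, policies):
    """Return the first backup policy that certifies `state`, else None."""
    for policy in policies:
        if certified_by(state, policy):
            return policy
    return None


def shield(state, rl_action, policies):
    """Keep the RL action if the next state is certified, else fall back."""
    candidate = step(state, rl_action)
    if certifying_policy(candidate, policies) is not None:
        return clip(rl_action)
    fallback = certifying_policy(state, policies)
    if fallback is None:
        raise ValueError("state is outside the certified set")
    return fallback(state)


BACKUP_POLICIES = [
    make_linear_policy(3.0, 3.0),
    make_linear_policy(6.0, 4.0),  # hypothetical stand-in for a learned policy
]
```

In this sketch, adding a policy to `BACKUP_POLICIES` can only grow the set of certified states, since a state needs just one certifying policy. That mirrors, in a much cruder form, why training an additional backup policy can enlarge the region available for exploration.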

Taxonomy

Topics: Reinforcement Learning in Robotics · Safety Systems Engineering in Autonomy · Formal Methods in Verification

Methods: Sparse Evolutionary Training