Effects of Safety State Augmentation on Safe Exploration

Aivar Sootla; Alexander I. Cowen-Rivers; Jun Wang; Haitham Bou Ammar

arXiv:2206.02675·cs.LG·October 13, 2022

TL;DR

This paper introduces Simmer, a safe reinforcement learning approach that augments the state space with a safety state tracking the remaining safety budget, and schedules that budget during training to improve safety and stability.

Contribution

The paper proposes a safety state augmentation technique that enables dynamic safety budget scheduling, enhancing safe exploration and stability in model-free RL.

Findings

01. Simmer improves safety during training in constrained RL tasks.

02. Simmer stabilizes training and improves performance in RL with constraints on an average cost.

03. Simmer manages the safety budget over the course of training through schedules derived from the safety state.

Abstract

Safe exploration is a challenging and important problem in model-free reinforcement learning (RL). Often the safety cost is sparse and unknown, which unavoidably leads to constraint violations -- a phenomenon ideally to be avoided in safety-critical applications. We tackle this problem by augmenting the state-space with a safety state, which is nonnegative if and only if the constraint is satisfied. The value of this state also serves as a distance toward constraint violation, while its initial value indicates the available safety budget. This idea allows us to derive policies for scheduling the safety budget during training. We call our approach Simmer (Safe policy IMproveMEnt for RL) to reflect the careful nature of these schedules. We apply this idea to two safe RL problems: RL with constraints imposed on an average cost, and RL with constraints imposed on a cost with probability…
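
The safety state described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and function names are hypothetical, the cost accumulation is undiscounted for simplicity, and the linear budget schedule is only a placeholder, since the abstract does not specify which schedules Simmer uses.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SafetyState:
    """Hypothetical tracker for the remaining safety budget in one episode.

    Assumption: costs are accumulated without discounting.
    """
    budget: float       # initial value = available safety budget
    value: float = 0.0  # current safety state (distance to violation)

    def reset(self) -> float:
        # At the start of an episode the safety state equals the budget.
        self.value = self.budget
        return self.value

    def step(self, cost: float) -> float:
        # Each incurred cost moves the state closer to constraint violation.
        self.value -= cost
        return self.value

    @property
    def satisfied(self) -> bool:
        # Nonnegative if and only if accumulated cost is within the budget.
        return self.value >= 0.0


def augment(observation: Sequence[float], safety_state: float) -> List[float]:
    """Append the safety state to the environment observation."""
    return list(observation) + [safety_state]


def linear_budget_schedule(start: float, end: float, num_stages: int) -> List[float]:
    """Placeholder schedule (an assumption, not the paper's): the budget
    moves linearly from `start` to `end` across training stages."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    if num_stages == 1:
        return [end]
    step = (end - start) / (num_stages - 1)
    return [start + step * i for i in range(num_stages)]
```

A training loop would call `reset()` at the start of each episode with the budget for the current stage, call `step(cost)` after every transition, and feed `augment(observation, state.value)` to the policy in place of the raw observation.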

Code & Models

Repositories

huawei-noah/hebo (PyTorch, official)

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Reinforcement Learning in Robotics