Progressive Safeguards for Safe and Model-Agnostic Reinforcement Learning

Nabil Omi; Hosein Hasanbeig; Hiteshi Sharma; Sriram K. Rajamani; Siddhartha Sen

arXiv:2410.24096·cs.LG·November 1, 2024

Open Access

TL;DR

This paper introduces a formal, model-agnostic meta-learning framework for safe reinforcement learning that uses a safeguard modeled as a finite-state machine to ensure safety across tasks, enabling efficient transfer of safety knowledge.

Contribution

The paper presents a novel, flexible safety framework for reinforcement learning that is model-agnostic, capable of handling complex safety specifications, and transferable across tasks with minimal violations.

Findings

01 Agents achieve near-minimal safety violations in experiments

02 The framework is applicable from pixel-level control to language models

03 Baseline methods underperform compared to the proposed approach

Abstract

In this paper, we propose a formal, model-agnostic meta-learning framework for safe reinforcement learning. Our framework is inspired by how parents safeguard their children across a progression of increasingly risky tasks, imparting a sense of safety that is carried over from task to task. We model this as a meta-learning process where each task is synchronized with a safeguard that monitors safety and provides a reward signal to the agent. The safeguard is implemented as a finite-state machine based on a safety specification; the reward signal is formally shaped around this specification. The safety specification and its corresponding safeguard can be arbitrarily complex and non-Markovian, which adds flexibility to the training process and explainability to the learned policy. The design of the safeguard is manual, but it is high-level and model-agnostic, which gives rise to an…
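To make the idea of a finite-state safeguard concrete, here is a minimal Python sketch. It is not the authors' implementation: the class `Safeguard`, the helpers `make_key_before_hazard` and `trace_rewards`, the event labels (`"key"`, `"hazard"`, `"empty"`), and the flat penalty of -1.0 are all hypothetical choices made for illustration, and the penalty stands in for the paper's formal reward shaping, which the truncated abstract does not spell out. The example specification, "do not enter a hazard before picking up a key", is non-Markovian because its verdict depends on the history of events rather than on the current one alone.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Safeguard:
    """Hypothetical finite-state monitor driven by event labels from a task."""
    initial: str
    transitions: Dict[Tuple[str, str], str]  # (state, label) -> next state
    unsafe: Set[str]                         # monitor states that count as violations
    penalty: float = -1.0                    # assumed flat penalty, not the paper's shaping
    state: str = field(init=False, default="")

    def __post_init__(self) -> None:
        self.state = self.initial

    def reset(self) -> None:
        """Return the monitor to its initial state at the start of an episode."""
        self.state = self.initial

    def step(self, label: str) -> float:
        """Advance on one event label and return the safety reward for this step."""
        # Pairs not listed in the transition table leave the monitor state unchanged.
        self.state = self.transitions.get((self.state, label), self.state)
        return self.penalty if self.state in self.unsafe else 0.0


def make_key_before_hazard() -> Safeguard:
    """Illustrative spec: entering a hazard is a violation unless a key was seen first."""
    return Safeguard(
        initial="no_key",
        transitions={
            ("no_key", "key"): "has_key",
            ("no_key", "hazard"): "violated",
            # "violated" has no outgoing transitions, so it is absorbing.
        },
        unsafe={"violated"},
    )


def trace_rewards(guard: Safeguard, labels: List[str]) -> List[float]:
    """Run the monitor over one episode's labels and collect its reward signal."""
    guard.reset()
    return [guard.step(label) for label in labels]


if __name__ == "__main__":
    guard = make_key_before_hazard()
    print(trace_rewards(guard, ["empty", "hazard"]))        # hazard without key
    print(trace_rewards(guard, ["key", "hazard", "empty"]))  # key first, then hazard
```

In a setup like this, the agent would typically observe the pair of task observation and `guard.state`, so that a history-dependent specification becomes Markovian over the combined state. Because the monitor only consumes high-level labels, the same object could in principle be reused across tasks that emit the same labels, which is one plausible reading of the model-agnostic claim.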

Taxonomy

Topics: Anomaly Detection Techniques and Applications

Methods: Sparse Evolutionary Training