Approximate Model-Based Shielding for Safe Reinforcement Learning

Alexander W. Goodall; Francesco Belardinelli

arXiv:2308.00707 · cs.LG · February 2, 2024 · 1 citation

TL;DR

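This paper introduces approximate model-based shielding (AMBS), a look-ahead shielding algorithm that verifies learned RL policies against a set of given safety constraints without prior knowledge of the safety-relevant dynamics. It reports better performance than other safety-aware approaches on a set of Atari games with state-dependent safety labels.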

Contribution

The paper presents AMBS, a look-ahead shielding algorithm for safe RL that requires no prior knowledge of the safety-relevant dynamics of the system and is supported by a theoretical justification.

Findings

01

AMBS outperforms other safety-aware approaches on a set of Atari games with state-dependent safety labels.

02

AMBS is supported by a theoretical justification of its safety performance.

03

The approach requires no prior knowledge of the safety-relevant dynamics of the system.

Abstract

Reinforcement learning (RL) has shown great potential for solving complex tasks in a variety of domains. However, applying RL to safety-critical systems in the real world is not easy, as many algorithms are sample-inefficient and maximising the standard RL objective comes with no guarantees on worst-case performance. In this paper, we propose approximate model-based shielding (AMBS), a principled look-ahead shielding algorithm for verifying the performance of learned RL policies w.r.t. a set of given safety constraints. Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system. We provide a strong theoretical justification for AMBS and demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety labels.
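To make the idea of look-ahead shielding concrete, here is a minimal sketch of the general pattern, not the authors' implementation. It assumes a stochastic dynamics model that can be sampled (standing in for a learned model), a state-dependent safety label, and a backup policy to fall back on. The toy 1D environment, the function names, the horizon, the number of rollouts, and the threshold are all hypothetical choices made for illustration.

```python
import random

# Hypothetical toy setup: the state is an integer position on a line,
# and every position >= 3 carries the "unsafe" label.
UNSAFE_POSITION = 3


def is_unsafe(state):
    """State-dependent safety label (assumed to be given)."""
    return state >= UNSAFE_POSITION


def model_step(state, action, rng):
    """Stand-in for a learned dynamics model: the move succeeds w.p. 0.9."""
    return state + action if rng.random() < 0.9 else state


def task_policy(state):
    """Hypothetical reward-seeking policy: always move right."""
    return 1


def backup_policy(state):
    """Hypothetical backup policy: always move left, away from danger."""
    return -1


def estimate_violation_prob(state, first_action, policy, horizon, n_rollouts, rng):
    """Monte Carlo estimate of reaching an unsafe state within `horizon` steps."""
    violations = 0
    for _ in range(n_rollouts):
        s = model_step(state, first_action, rng)
        unsafe = is_unsafe(s)
        for _ in range(horizon - 1):
            if unsafe:
                break  # the rollout already violated the constraint
            s = model_step(s, policy(s), rng)
            unsafe = is_unsafe(s)
        violations += 1 if unsafe else 0
    return violations / n_rollouts


def shielded_action(state, policy, backup, horizon=5, n_rollouts=200,
                    threshold=0.1, rng=None):
    """Return (action, estimated violation probability) for the current state.

    The proposed action is kept only if the estimated probability of a
    violation within the look-ahead horizon stays at or below `threshold`.
    """
    if rng is None:
        rng = random.Random(0)
    proposed = policy(state)
    p = estimate_violation_prob(state, proposed, policy, horizon, n_rollouts, rng)
    if p > threshold:
        return backup(state), p
    return proposed, p


if __name__ == "__main__":
    rng = random.Random(0)
    for start in (0, 2):
        action, p = shielded_action(start, task_policy, backup_policy,
                                    horizon=2, rng=rng)
        print(f"state={start} action={action} estimated_violation_prob={p:.2f}")
```

In this sketch the shield only overrides the task policy when sampled rollouts suggest a violation is likely within the horizon. Because the estimate comes from an approximate model and a finite number of samples, it is itself approximate; the threshold and rollout count trade off conservativeness against computation.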

Code & Models

Repositories

sacktock/ambs (JAX, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)