SAMBA: Safe Model-Based & Active Reinforcement Learning

Alexander I. Cowen-Rivers; Daniel Palenicek; Vincent Moens; Mohammed Abdullah; Aivar Sootla; Jun Wang; Haitham Ammar

arXiv:2006.09436 · cs.LG · June 18, 2020

TL;DR

SAMBA is a new safe reinforcement learning framework that combines probabilistic models and active exploration to significantly reduce sample use and safety violations in complex dynamical systems.

Contribution

SAMBA introduces a multi-objective optimisation approach, subject to safety constraints, for active exploration in model-based reinforcement learning.

Findings

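01 Orders-of-magnitude reduction in the samples needed, compared to state-of-the-art methods.

02 Orders-of-magnitude reduction in safety violations, compared to state-of-the-art methods.

03 Evaluated on safe dynamical system benchmarks with both low- and high-dimensional state representations.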

Abstract

In this paper, we propose SAMBA, a novel framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics. Our method builds upon PILCO to enable active exploration using novel (semi-)metrics for out-of-sample Gaussian process evaluation, optimised through a multi-objective problem that supports conditional-value-at-risk constraints. We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low- and high-dimensional state representations. Our results show orders-of-magnitude reductions in samples and violations compared to state-of-the-art methods. Lastly, we provide intuition as to the effectiveness of the framework through a detailed analysis of our active metrics and safety constraints.
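
The abstract mentions conditional-value-at-risk (CVaR) constraints without defining them. As a rough illustration only, CVaR at level alpha is commonly described as the expected cost in the worst (1 - alpha) tail of the cost distribution, and a CVaR constraint requires that tail average to stay below a threshold. The sketch below estimates it from sampled costs with NumPy. The function names `empirical_cvar` and `satisfies_cvar_constraint`, the sample-based estimator, and the threshold are all assumptions made for this example; they are not taken from the paper, which may compute the constraint differently.

```python
import numpy as np


def empirical_cvar(costs, alpha=0.95):
    """Estimate CVaR as the mean of the sampled costs at or above the alpha-quantile.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    costs = np.asarray(costs, dtype=float)
    # Value-at-risk: the alpha-quantile of the sampled costs.
    value_at_risk = np.quantile(costs, alpha)
    # CVaR: average over the tail of costs at or beyond the value-at-risk.
    tail = costs[costs >= value_at_risk]
    return tail.mean()


def satisfies_cvar_constraint(costs, alpha, threshold):
    """Return True when the estimated CVaR does not exceed the threshold."""
    return empirical_cvar(costs, alpha) <= threshold
```

Under these assumptions, a candidate policy would be treated as safe when `satisfies_cvar_constraint` holds for the costs sampled from its rollouts, which penalises rare but severe violations more than a constraint on the mean cost would.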

Code & Models

Repositories

GAMES-UChile/mogptk
tf

Taxonomy

Topics: Software Reliability and Analysis Research · Safety Systems Engineering in Autonomy · Fault Detection and Control Systems

Methods: Gaussian Process