Safer Reinforcement Learning through Transferable Instinct Networks

Djordje Grbic; Sebastian Risi

arXiv:2107.06686·cs.LG·July 15, 2021

TL;DR

This paper introduces IR^2L, a reinforcement learning method that uses a pre-trained instinct network to override unsafe actions, significantly reducing safety violations during training in safety-critical environments.

Contribution

The paper proposes a transferable instinct network that enhances safety in reinforcement learning by overriding unsafe actions. The network is trained on a single task and then applied to new environments.

Findings

01. IR^2L reduces safety violations compared to baseline RL.

02. IR^2L achieves task performance similar to the baseline.

03. The pre-trained instinct network transfers effectively to new environments.

Abstract

Random exploration is one of the main mechanisms through which reinforcement learning (RL) finds well-performing policies. However, it can lead to undesirable or catastrophic outcomes when learning online in safety-critical environments. In fact, safe learning is one of the major obstacles towards real-world agents that can learn during deployment. One way of ensuring that agents respect hard limitations is to explicitly configure boundaries in which they can operate. While this might work in some cases, we do not always have clear a-priori information which states and actions can lead dangerously close to hazardous states. Here, we present an approach where an additional policy can override the main policy and offer a safer alternative action. In our instinct-regulated RL (IR^2L) approach, an "instinctual" network is trained to recognize undesirable situations, while guarding the…
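
The override idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the hard threshold, the function name `regulated_action`, the instinct signature (returning a danger score and an alternative action), and the one-dimensional toy environment are all assumptions made for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]
Action = List[float]

# Hypothetical signatures, assumed for this sketch only.
MainPolicy = Callable[[State], Action]
Instinct = Callable[[State, Action], Tuple[float, Action]]


def regulated_action(
    state: State,
    main_policy: MainPolicy,
    instinct: Instinct,
    threshold: float = 0.5,
) -> Tuple[Action, bool]:
    """Return the action to execute and whether the instinct overrode it.

    Assumption: the instinct emits a danger score in [0, 1] plus a safer
    alternative action, and a hard threshold decides which action wins.
    """
    proposed = list(main_policy(state))
    danger, safe_action = instinct(state, proposed)
    if danger > threshold:
        return list(safe_action), True
    return proposed, False


# Toy 1-D world (hypothetical): the agent sits at position x and a hazard
# starts at x >= 1.0. The main policy always tries to step right by 0.5.
def toy_main_policy(state: State) -> Action:
    return [0.5]


def toy_instinct(state: State, action: Action) -> Tuple[float, Action]:
    next_x = state[0] + action[0]
    danger = 1.0 if next_x >= 1.0 else 0.0
    return danger, [0.0]  # the safer alternative here is to stand still


far_action, far_overridden = regulated_action([0.0], toy_main_policy, toy_instinct)
near_action, near_overridden = regulated_action([0.8], toy_main_policy, toy_instinct)
```

Far from the hazard the main policy's action passes through unchanged; close to it, the instinct's alternative replaces the proposed step. In a learned setting both callables would be neural networks, and the override rule could be a soft blend rather than a threshold.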

Code & Models

Repositories

djole/IR2L (PyTorch, official)
