Avoidance Learning Using Observational Reinforcement Learning

David Venuto; Leonard Boussioux; Junhao Wang; Rola Dali; Jhelum Chakravorty; Yoshua Bengio; Doina Precup

arXiv:1909.11228·cs.LG·September 26, 2019

TL;DR

This paper introduces avoidance learning, a method where an agent learns to avoid dangerous behaviors demonstrated by others, improving safety and sample efficiency in complex environments.

Contribution

It proposes a novel avoidance learning framework using state occupancy distribution distances and demonstrates its effectiveness in safety and efficiency improvements.

Findings

01. Improved sample efficiency over existing methods
02. Effective avoidance of dangerous behaviors in various environments
03. Framework applicable to partially observable settings

Abstract

Imitation learning seeks to learn an expert policy from sampled demonstrations. However, in the real world, it is often difficult to find a perfect expert and avoiding dangerous behaviors becomes relevant for safety reasons. We present the idea of "learning to avoid", an objective opposite to imitation learning in some sense, where an agent learns to avoid a demonstrator policy given an environment. We define avoidance learning as the process of optimizing the agent's reward while avoiding dangerous behaviors given by a demonstrator. In this work we develop a framework of avoidance learning by defining a suitable objective function for these problems which involves the distance of state occupancy distributions of the expert and demonstrator policies. We use density estimates for state occupancy measures and use the aforementioned distance as the reward bonus for avoiding…
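The reward-bonus idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a discrete state space, a smoothed histogram as the density estimate, total variation as the distance, and a hypothetical weight `beta`. All function names are illustrative only.

```python
import numpy as np


def occupancy_estimate(states, n_states, smoothing=1e-3):
    """Smoothed histogram estimate of a state occupancy distribution.

    Assumption: states are integer indices in [0, n_states).
    """
    counts = np.bincount(np.asarray(states, dtype=np.int64), minlength=n_states)
    counts = counts.astype(float) + smoothing  # avoid exact zeros
    return counts / counts.sum()


def total_variation(p, q):
    """Total variation distance between two discrete distributions (in [0, 1])."""
    return 0.5 * np.abs(p - q).sum()


def shaped_return(env_rewards, agent_states, demo_states, n_states, beta=1.0):
    """Environment return plus a bonus for occupying different states than the demonstrator.

    Assumption: the bonus is beta times the distance between the two
    occupancy estimates; the paper's actual distance and weighting may differ.
    """
    d_agent = occupancy_estimate(agent_states, n_states)
    d_demo = occupancy_estimate(demo_states, n_states)
    bonus = total_variation(d_agent, d_demo)
    return float(np.sum(env_rewards)) + beta * bonus, bonus


# Toy data: the demonstrator visits the "dangerous" states 3 and 4.
demo_states = [3, 3, 4, 4, 4]
overlapping = [3, 4, 4, 3, 4]   # agent trajectory that mimics the demonstrator
avoiding = [0, 1, 1, 2, 0]      # agent trajectory that stays away
rewards = [0.0] * 5

_, bonus_overlap = shaped_return(rewards, overlapping, demo_states, n_states=5)
_, bonus_avoid = shaped_return(rewards, avoiding, demo_states, n_states=5)
print(f"bonus when overlapping: {bonus_overlap:.3f}")
print(f"bonus when avoiding:    {bonus_avoid:.3f}")
```

In this toy setup the trajectory that stays away from the demonstrator's states receives the larger bonus, which is the qualitative behavior an avoidance objective is meant to encourage. In continuous state spaces the histogram would need to be replaced by some other density estimator.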

Code & Models

Repositories

maximecb/gym-miniworld · PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Robot Manipulation and Learning