Learning from Interventions using Hierarchical Policies for Safe Learning

Jing Bi; Vikas Dhiman; Tianyou Xiao; Chenliang Xu

arXiv:1912.02241 · cs.RO · December 6, 2019

TL;DR

This paper introduces a hierarchical policy framework for Learning from Interventions that interpolates the expert's interventions back in time and predicts sub-goals, aiming for safer and faster learning with better final performance on complex tasks.

Contribution

It proposes a novel hierarchical approach with sub-goal prediction and intervention interpolation to improve Learning from Interventions over traditional Behavior Cloning.

Findings

01 Faster training compared to LfD.

02 Better asymptotic performance.

03 Robustness to expert reaction delays.

Abstract

Learning from Demonstrations (LfD) via Behavior Cloning (BC) works well on multiple complex tasks. However, a limitation of the typical LfD approach is that it requires expert demonstrations for all scenarios, including those in which the algorithm is already well-trained. The recently proposed Learning from Interventions (LfI) overcomes this limitation by using an expert overseer, who intervenes only when it suspects that an unsafe action is about to be taken. Although LfI significantly improves over LfD, the state-of-the-art LfI fails to account for the delay caused by the expert's reaction time and only learns short-term behavior. We address these limitations by 1) interpolating the expert's interventions back in time, and 2) splitting the policy into two hierarchical levels, one that generates sub-goals for the future and another that generates actions to reach those…
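
The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the excerpt does not give the interpolation scheme or the network design, so the code makes simplifying assumptions. The function names `backfill_interventions` and `hierarchical_act`, the `delay_steps` parameter, and the scalar `Action` type are all hypothetical. As a stand-in for interpolation, the first action of each intervention is simply copied back over the steps the expert is assumed to have needed to react.

```python
from typing import Callable, List, Optional, Sequence

Action = float  # assumption: scalar actions, used only to keep the sketch small


def backfill_interventions(
    expert_actions: Sequence[Optional[Action]], delay_steps: int
) -> List[Optional[Action]]:
    """Relabel the steps just before each intervention onset.

    expert_actions[t] is None when the expert did not intervene at step t.
    Assumption: the expert reacts `delay_steps` steps late, so the action
    given at the onset is copied back over that many preceding steps.
    """
    labels = list(expert_actions)
    for t, action in enumerate(expert_actions):
        is_onset = action is not None and (t == 0 or expert_actions[t - 1] is None)
        if not is_onset:
            continue
        # Fill only unlabeled steps so earlier interventions are not overwritten.
        for s in range(max(0, t - delay_steps), t):
            if labels[s] is None:
                labels[s] = action
    return labels


def hierarchical_act(
    state: float,
    subgoal_policy: Callable[[float], float],
    action_policy: Callable[[float, float], Action],
) -> Action:
    """Two-level policy: pick a future sub-goal, then an action toward it."""
    subgoal = subgoal_policy(state)
    return action_policy(state, subgoal)
```

With `delay_steps=2`, an intervention that starts at step 3 also labels steps 1 and 2, which is the sense in which late expert reactions are pushed back in time. In `hierarchical_act`, the two callables stand in for whatever learned models the upper and lower policy levels would be.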

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Machine Learning and Data Classification