Interactive Learning with Corrective Feedback for Policies based on Deep Neural Networks

Rodrigo Pérez-Dattari, Carlos Celemin, Javier Ruiz-del-Solar, and Jens Kober

arXiv:1810.00466 · cs.LG · October 2, 2018

TL;DR

This paper introduces Deep COACH, a human-in-the-loop training method for deep neural network policies that reduces data requirements by using corrective feedback instead of reward functions, demonstrated on simulated and real robot tasks.

Contribution

The paper presents Deep COACH, a novel interactive learning framework combining human corrective feedback with deep learning, eliminating the need for reward functions in policy training.

Findings

01. Faster policy learning compared to traditional DRL methods.

02. Effective in both low- and high-dimensional state spaces.

03. Successfully applied to both simulated tasks and a real robot.

Abstract

Deep Reinforcement Learning (DRL) has become a powerful strategy for solving complex decision-making problems with Deep Neural Networks (DNNs). However, it is highly data-demanding, which makes it infeasible on physical systems for most applications. In this work, we take an alternative Interactive Machine Learning (IML) approach to training DNN policies from human corrective feedback, with a method called Deep COACH (D-COACH). This approach takes advantage of the knowledge and insights of human teachers as well as the power of DNNs, and it needs no reward function (which sometimes requires external perception to compute rewards). We combine Deep Learning with the COrrective Advice Communicated by Humans (COACH) framework, in which non-expert humans shape policies by correcting the agent's actions during execution. The D-COACH framework has the potential…
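The abstract describes teachers shaping a policy by correcting its actions during execution. The sketch below illustrates one way such a correction could become a supervised update. It is not taken from this page or from the authors' repository: the linear policy (standing in for a deep network), the label rule `action + error_magnitude * h`, the simulated teacher, and all parameter names and values are assumptions chosen for illustration.

```python
import random


def policy(weights, state):
    """Linear policy, a stand-in for a deep network: action = w . s."""
    return sum(w * s for w, s in zip(weights, state))


def corrective_update(weights, state, h, error_magnitude=0.5, learning_rate=0.1):
    """One supervised step toward the corrected action.

    h is the teacher's signal: +1 (increase the action), -1 (decrease it),
    0 (no feedback). The label rule below is an assumption of this sketch.
    """
    if h == 0:
        return list(weights)  # no feedback: policy is left unchanged
    action = policy(weights, state)
    target = action + error_magnitude * h  # assumed label: shift in the advised direction
    # Gradient step on 0.5 * (target - w . s)^2 with respect to w.
    return [w + learning_rate * (target - action) * s for w, s in zip(weights, state)]


def simulated_teacher(action, desired, tolerance=0.05):
    """Hypothetical stand-in for a human: reports only the direction of the error."""
    if abs(desired - action) <= tolerance:
        return 0
    return 1 if desired > action else -1


def train(steps=500, seed=0):
    """Run the feedback loop and return (learned weights, teacher's weights)."""
    rng = random.Random(seed)
    teacher_weights = [0.8, -0.3]  # behaviour the simulated teacher has in mind
    weights = [0.0, 0.0]
    for _ in range(steps):
        state = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        h = simulated_teacher(policy(weights, state), policy(teacher_weights, state))
        weights = corrective_update(weights, state, h)
    return weights, teacher_weights


if __name__ == "__main__":
    learned, wanted = train()
    print("learned:", learned, "teacher:", wanted)
```

Note that no reward appears anywhere in the loop: the only training signal is the direction of the correction, which is the property the abstract emphasises.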

Code & Models

Repositories

rperezdattari/Interactive-Learning-with-Corrective-Feedback-for-Policies-based-on-Deep-Neural-Networks
tf
