Teachable Reinforcement Learning via Advice Distillation

Olivia Watkins; Trevor Darrell; Pieter Abbeel; Jacob Andreas; Abhishek Gupta

arXiv:2203.11197 · cs.LG · February 21, 2023

TL;DR

This paper introduces a new interactive learning paradigm where agents learn from structured advice provided by a teacher, reducing the need for extensive human supervision compared to traditional reinforcement and imitation learning methods.

Contribution

It formalizes a class of human-in-the-loop decision problems and proposes a simple algorithm enabling agents to interpret and learn from advice, improving efficiency in skill acquisition.

Findings

01. Agents learn new skills with less human supervision.

02. Advice-based learning outperforms standard reinforcement learning.

03. Agents often require less supervision than imitation learning.

Abstract

Training automated agents to complete complex tasks in interactive environments is challenging: reinforcement learning requires careful hand-engineering of reward functions, imitation learning requires specialized infrastructure and access to a human expert, and learning from intermediate forms of supervision (like binary preferences) is time-consuming and extracts little information from each human intervention. Can we overcome these challenges by building agents that learn from rich, interactive feedback instead? We propose a new supervision paradigm for interactive learning based on "teachable" decision-making systems that learn from structured advice provided by an external teacher. We begin by formalizing a class of human-in-the-loop decision making problems in which multiple forms of teacher-provided advice are available to a learner. We then describe a simple learning algorithm…

Code & Models

Repositories

rll-research/teachable (official)

Videos

Teachable Reinforcement Learning via Advice Distillation · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Mobile Crowdsensing and Crowdsourcing