Teachable Reinforcement Learning via Advice Distillation
Olivia Watkins, Trevor Darrell, Pieter Abbeel, Jacob Andreas, Abhishek Gupta

TL;DR
This paper introduces an interactive learning paradigm in which agents learn from structured advice provided by a teacher, requiring less human supervision than traditional reinforcement learning and imitation learning methods.
Contribution
It formalizes a class of human-in-the-loop decision problems and proposes a simple algorithm enabling agents to interpret and learn from advice, improving efficiency in skill acquisition.
Findings
Agents learn new skills with less human supervision.
Advice-based learning outperforms standard reinforcement learning.
Agents often require less supervision than imitation learning does.
Abstract
Training automated agents to complete complex tasks in interactive environments is challenging: reinforcement learning requires careful hand-engineering of reward functions, imitation learning requires specialized infrastructure and access to a human expert, and learning from intermediate forms of supervision (like binary preferences) is time-consuming and extracts little information from each human intervention. Can we overcome these challenges by building agents that learn from rich, interactive feedback instead? We propose a new supervision paradigm for interactive learning based on "teachable" decision-making systems that learn from structured advice provided by an external teacher. We begin by formalizing a class of human-in-the-loop decision making problems in which multiple forms of teacher-provided advice are available to a learner. We then describe a simple learning algorithm…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Mobile Crowdsensing and Crowdsourcing
