Better-than-Demonstrator Imitation Learning via Automatically-Ranked Demonstrations
Daniel S. Brown, Wonjoon Goo, and Scott Niekum

TL;DR
This paper introduces D-REX, a ranking-based imitation learning method that generates ranked demonstrations automatically by injecting noise into a learned policy, so the final policy can outperform the demonstrator without human-provided preference labels.
Contribution
The paper provides a theoretical condition for better-than-demonstrator imitation and introduces D-REX, a novel method that automatically creates ranked demonstrations for reward extrapolation.
Findings
D-REX outperforms standard imitation learning methods.
D-REX can surpass the demonstrator’s performance.
Automatic ranking generation enables effective reward learning.
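The idea behind automatic ranking can be illustrated with a small sketch. It is a toy illustration under stated assumptions, not the authors' implementation: the noise is assumed to be epsilon-greedy, the environment is a hypothetical 1-D chain, and every name (noisy_rollout, ranked_pairs, pairwise_ranking_loss) is invented for this example. The ranking rule assumed here is that rollouts generated with more noise are worse, and the loss is a Bradley-Terry style pairwise objective of the kind commonly used to train a reward model from ranked trajectories.

```python
import math
import random


def noisy_rollout(policy, step, start, epsilon, n_actions, horizon, rng):
    """Roll out `policy`; with probability `epsilon` take a uniform random action instead."""
    state, trajectory = start, []
    for _ in range(horizon):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # injected noise
        else:
            action = policy(state)
        trajectory.append((state, action))
        state = step(state, action)
    return trajectory


def ranked_pairs(noise_levels, rollouts_per_level, rollout_fn):
    """Build (worse, better) trajectory pairs, assuming more noise means a worse rollout.

    `noise_levels` is assumed to contain distinct values.
    """
    levels = sorted(noise_levels)
    by_level = {eps: [rollout_fn(eps) for _ in range(rollouts_per_level)] for eps in levels}
    pairs = []
    for i, low in enumerate(levels):
        for high in levels[i + 1:]:
            pairs.extend(
                (worse, better) for worse in by_level[high] for better in by_level[low]
            )
    return pairs


def pairwise_ranking_loss(return_worse, return_better):
    """Negative log-probability that the better trajectory wins under a softmax
    over the two predicted returns (computed with a stable log-sum-exp)."""
    m = max(return_worse, return_better)
    log_z = m + math.log(math.exp(return_worse - m) + math.exp(return_better - m))
    return log_z - return_better


# Hypothetical toy setup: a 1-D chain where the stand-in policy always moves right.
rng = random.Random(0)
policy = lambda state: 1                                  # action 1 = right, 0 = left
step = lambda state, action: state + (1 if action == 1 else -1)
rollout = lambda eps: noisy_rollout(policy, step, 0, eps, 2, 20, rng)

pairs = ranked_pairs([0.0, 0.5, 1.0], rollouts_per_level=3, rollout_fn=rollout)
```

In a full pipeline, the two arguments to pairwise_ranking_loss would be the summed outputs of a learned reward model over each trajectory in a pair, and the loss would be minimized over all pairs. The learned reward would then be optimized with a reinforcement learning algorithm.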
Abstract
The performance of imitation learning is typically upper-bounded by the performance of the demonstrator. While recent empirical results demonstrate that ranked demonstrations allow for better-than-demonstrator performance, preferences over demonstrations may be difficult to obtain, and little is known theoretically about when such methods can be expected to successfully extrapolate beyond the performance of the demonstrator. To address these issues, we first contribute a sufficient condition for better-than-demonstrator imitation learning and provide theoretical results showing why preferences over demonstrations can better reduce reward function ambiguity when performing inverse reinforcement learning. Building on this theory, we introduce Disturbance-based Reward Extrapolation (D-REX), a ranking-based imitation learning method that injects noise into a policy learned through…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Neurotransmitter Receptor Influence on Behavior
