An Extensible Interactive Interface for Agent Design
Matthew Rahtz, James Fang, Anca D. Dragan, Dylan Hadfield-Menell

TL;DR
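This paper introduces an interactive interface for agent design in which the user specifies a task through demonstrations, generated by switching between existing policies at fixed intervals, rather than through a hand-written reward function. In a Lunar Lander case study, the approach quickly learns a successful landing policy.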
Contribution
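An interactive method for task specification through demonstrations: the user composes a set of increasingly complex policies into demonstrations of new tasks, and new policies are trained on those demonstrations. In the Lunar Lander case study, it outperforms an existing comparison-based deep RL method.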
Findings
Quickly learned a successful landing policy in the Lunar Lander domain
Outperformed an existing comparison-based deep RL method
Showed that demonstrations composed from simpler policies can specify novel, more complex tasks
Abstract
In artificial intelligence, we often specify tasks through a reward function. While this works well in some settings, many tasks are hard to specify this way. In deep reinforcement learning, for example, directly specifying a reward as a function of a high-dimensional observation is challenging. Instead, we present an interface for specifying tasks interactively using demonstrations. Our approach defines a set of increasingly complex policies. The interface allows the user to switch between these policies at fixed intervals to generate demonstrations of novel, more complex, tasks. We train new policies based on these demonstrations and repeat the process. We present a case study of our approach in the Lunar Lander domain, and show that this simple approach can quickly learn a successful landing policy and outperforms an existing comparison-based deep RL method.
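The loop described in the abstract (switch between existing policies at fixed intervals, record the resulting demonstration, train a new policy on it, repeat) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy 1-D environment stands in for Lunar Lander, the scripted `choose` function stands in for the human user, tabular majority-vote behavioural cloning stands in for whatever training procedure the authors use, and all names (`collect_demo`, `clone`, `step`, `choose`) are hypothetical.

```python
from collections import Counter, defaultdict


def collect_demo(policies, choose, step, obs, horizon, interval):
    """Roll out for `horizon` steps; every `interval` steps the user
    (here the `choose` callback) picks which policy is in control."""
    demo = []
    active = None
    for t in range(horizon):
        if t % interval == 0:
            active = policies[choose(obs)]
        action = active(obs)
        demo.append((obs, action))
        obs = step(obs, action)
    return demo


def clone(demo, default_action):
    """Tabular behavioural cloning (an assumed stand-in for the paper's
    training step): pick the most frequent action per observation."""
    votes = defaultdict(Counter)
    for obs, action in demo:
        votes[obs][action] += 1
    table = {o: c.most_common(1)[0][0] for o, c in votes.items()}

    def policy(obs):
        return table.get(obs, default_action)

    return policy


# Toy 1-D world (assumption): position in [-5, 5], actions are -1 or +1.
def step(pos, action):
    return max(-5, min(5, pos + action))


def go_left(obs):
    return -1


def go_right(obs):
    return +1


# Scripted stand-in for the human user: steer toward position 3.
def choose(obs):
    return 1 if obs < 3 else 0  # index into `policies`


policies = [go_left, go_right]  # initial set of simple policies
demo = collect_demo(policies, choose, step, obs=-5, horizon=12, interval=2)
hover = clone(demo, default_action=-1)
policies.append(hover)  # the new policy is available in the next round

# Roll out the cloned policy on its own to see where it ends up.
pos = -5
for _ in range(10):
    pos = step(pos, hover(pos))
```

After one round, `policies` holds a third, more complex behaviour ("move to position 3 and stay near it") that neither primitive policy expresses on its own; repeating the loop with the enlarged set is what would let the user build up to harder tasks.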
Taxonomy
Topics: Reinforcement Learning in Robotics · AI-based Problem Solving and Planning · Robotic Path Planning Algorithms
