Third-Person Imitation Learning
Bradly C. Stadie, Pieter Abbeel, Ilya Sutskever

TL;DR
This paper introduces an unsupervised third-person imitation learning method that lets agents learn from demonstrations observed from a different viewpoint, without explicit state correspondence between demonstrator and agent. The approach is motivated by how humans learn a task by watching others perform it.
Contribution
The paper presents a domain confusion-based approach to third-person imitation learning, addressing the challenge of learning from demonstrations that differ in viewpoint from the agent's own observations, without supervision.
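To make the idea of domain confusion concrete, here is a minimal NumPy sketch of the kind of objective such an approach optimizes. It is an illustration under stated assumptions, not the authors' implementation: the linear feature map, the two logistic heads, the weight `lam`, and every function and variable name below are hypothetical. The feature extractor is rewarded for features that help separate expert from novice observations and penalized when the same features reveal which viewpoint (domain) an observation came from.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(p, y, eps=1e-7):
    # Mean binary cross-entropy between predicted probabilities p and labels y.
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def domain_confusion_losses(obs, is_expert, domain, W_feat, w_class, w_domain, lam=1.0):
    """Hypothetical helper: returns (class_loss, domain_loss, feature_objective)."""
    # Shared features computed from raw observations (assumed: one tanh layer).
    feats = np.tanh(obs @ W_feat)
    # Head 1: is this observation from the expert or the novice?
    class_loss = binary_cross_entropy(sigmoid(feats @ w_class), is_expert)
    # Head 2: which viewpoint (domain) produced this observation?
    domain_loss = binary_cross_entropy(sigmoid(feats @ w_domain), domain)
    # The feature extractor minimizes class_loss while *maximizing* domain_loss,
    # so the sign of the domain term is flipped (assumed weighting: lam).
    feature_objective = class_loss - lam * domain_loss
    return class_loss, domain_loss, feature_objective

# Toy data: 8 observations of dimension 5 with random labels (illustrative only).
rng = np.random.default_rng(0)
obs = rng.normal(size=(8, 5))
is_expert = rng.integers(0, 2, size=8).astype(float)
domain = rng.integers(0, 2, size=8).astype(float)
W_feat = rng.normal(size=(5, 3))
w_class = rng.normal(size=3)
w_domain = rng.normal(size=3)

class_loss, domain_loss, feature_objective = domain_confusion_losses(
    obs, is_expert, domain, W_feat, w_class, w_domain, lam=0.5
)
print(class_loss, domain_loss, feature_objective)
```

In a full training loop the domain head would itself be trained to minimize `domain_loss`, while the feature extractor receives the sign-flipped term, which is what pushes the features toward being viewpoint-agnostic. That loop, and the policy update that uses the expert-versus-novice classifier as a reward signal, are omitted here.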
Findings
Successful learning in pointmass, reacher, and inverted pendulum environments.
Demonstrates effectiveness of domain-agnostic features in third-person imitation.
Outperforms traditional methods requiring first-person demonstrations.
Abstract
Reinforcement learning (RL) makes it possible to train agents capable of achieving sophisticated goals in complex and uncertain environments. A key difficulty in reinforcement learning is specifying a reward function for the agent to optimize. Traditionally, imitation learning in RL has been used to overcome this problem. Unfortunately, hitherto imitation learning methods tend to require that demonstrations are supplied in the first-person: the agent is provided with a sequence of states and a specification of the actions that it should have taken. While powerful, this kind of imitation learning is limited by the relatively hard problem of collecting first-person demonstrations. Humans address this problem by learning from third-person demonstrations: they observe other humans perform tasks, infer the task, and accomplish the same task themselves. In this paper, we present a method…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Multimodal Machine Learning Applications
