Generative Adversarial Imitation Learning
Jonathan Ho, Stefano Ermon

TL;DR
This paper introduces a novel imitation learning framework inspired by generative adversarial networks, enabling direct policy extraction from expert data with improved performance in complex, high-dimensional environments.
Contribution
It proposes a new framework that directly learns policies from expert data, bypassing inverse reinforcement learning, and introduces a model-free algorithm inspired by GANs.
Findings
Significant performance gains over existing model-free imitation learning methods.
Effective imitation of complex behaviors in large, high-dimensional environments.
One instantiation of the framework draws a new analogy between imitation learning and generative adversarial networks.
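For reference, the GAN connection can be written as a saddle-point problem. The form below follows the objective stated in the paper, where D is a discriminator over state-action pairs, \pi_E is the expert policy, H(\pi) is the causal entropy of the policy, and \lambda \ge 0 is an entropy weight; treat it as a notational sketch rather than a full derivation.

```latex
\min_{\pi} \max_{D} \;
  \mathbb{E}_{\pi}\!\left[\log D(s,a)\right]
  + \mathbb{E}_{\pi_E}\!\left[\log\bigl(1 - D(s,a)\bigr)\right]
  - \lambda H(\pi)
```

Under this convention D is pushed toward 1 on the learner's state-action pairs and toward 0 on the expert's, so \log D(s,a) acts as a cost that the policy tries to reduce.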
Abstract
Consider learning a policy from example expert behavior, without interaction with the expert or access to reinforcement signal. One approach is to recover the expert's cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning. We show that a certain instantiation of our framework draws an analogy between imitation learning and generative adversarial networks, from which we derive a model-free imitation learning algorithm that obtains significant performance gains over existing model-free methods in imitating complex behaviors in large, high-dimensional environments.
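The discriminator side of the GAN analogy can be illustrated with a minimal sketch. It assumes a linear-logistic discriminator and synthetic Gaussian "state-action" features, neither of which comes from the paper; the names `discriminator_step` and `surrogate_cost` are hypothetical, and the policy update (a TRPO step in the paper) is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_step(w, policy_sa, expert_sa, lr=0.1):
    """One gradient-ascent step on E_pi[log D] + E_expert[log(1 - D)],
    with D(s, a) = sigmoid(features @ w). The linear form is an assumption."""
    d_pi = sigmoid(policy_sa @ w)
    d_ex = sigmoid(expert_sa @ w)
    # d/dw log D = (1 - D) * x,  d/dw log(1 - D) = -D * x
    grad = (policy_sa * (1.0 - d_pi)[:, None]).mean(axis=0) \
         - (expert_sa * d_ex[:, None]).mean(axis=0)
    return w + lr * grad

def surrogate_cost(w, sa):
    """Cost log D(s, a) that a policy step would try to reduce."""
    return np.log(sigmoid(sa @ w) + 1e-8)

# Synthetic stand-ins for concatenated state-action features (illustrative only).
rng = np.random.default_rng(0)
expert_sa = rng.normal(loc=1.0, scale=0.5, size=(256, 4))
policy_sa = rng.normal(loc=-1.0, scale=0.5, size=(256, 4))

w = np.zeros(4)
for _ in range(200):
    w = discriminator_step(w, policy_sa, expert_sa)

mean_cost_policy = surrogate_cost(w, policy_sa).mean()
mean_cost_expert = surrogate_cost(w, expert_sa).mean()
print(mean_cost_policy, mean_cost_expert)
```

After training, expert-like pairs receive a lower cost than the learner's pairs, which is the signal a reinforcement learning step would use to move the policy toward the expert. In a full loop the two updates alternate, with fresh policy rollouts each iteration.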
Taxonomy
Topics: Reinforcement Learning in Robotics · Model Reduction and Neural Networks · Generative Adversarial Networks and Image Synthesis
Methods: Generative Adversarial Imitation Learning
