Model-based Adversarial Imitation Learning

Nir Baram; Oron Anschel; Shie Mannor

arXiv:1612.02179 · stat.ML · December 8, 2016 · 27 cites

TL;DR

This paper introduces MAIL, a model-based adversarial imitation learning algorithm that uses a forward model to make training differentiable end-to-end. Compared with model-free adversarial imitation, it is reported to need fewer environment interactions and fewer hyper-parameters to tune, and to outperform the prior state of the art on MuJoCo tasks.

Contribution

MAIL is the first to incorporate a forward model into adversarial imitation learning, enabling fully differentiable training and improved efficiency.

Findings

01. Outperforms the current state of the art on MuJoCo tasks
02. Requires fewer environment interactions
03. Needs fewer hyper-parameters to tune

Abstract

Generative adversarial learning is a popular new approach to training generative models which has been proven successful for other related problems as well. The general idea is to maintain an oracle $D$ that discriminates between the expert's data distribution and that of the generative model $G$. The generative model is trained to capture the expert's distribution by maximizing the probability of $D$ misclassifying the data it generates. Overall, the system is differentiable end-to-end and is trained using basic backpropagation. This type of learning was successfully applied to the problem of policy imitation in a model-free setup. However, a model-free approach does not allow the system to be differentiable, which requires the use of high-variance gradient estimations. In this paper we introduce the Model-based Adversarial Imitation Learning (MAIL) algorithm. A model-based…
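To make the differentiability argument concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes a scalar state, a linear policy `a = theta * s`, a hand-written forward model `s' = s + a`, and a logistic discriminator with fixed weights `w` and `b`; all of these names and forms are hypothetical and chosen only for illustration. With a differentiable forward model in the loop, the gradient of the policy objective $\log D(s')$ with respect to the policy parameter follows from the chain rule, and the sketch checks it against a finite-difference estimate.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def policy(theta, s):
    # Hypothetical deterministic linear policy: a = theta * s.
    return theta * s


def forward_model(s, a):
    # Hypothetical differentiable forward model (assumed, not learned here).
    return s + a


def discriminator(w, b, s_next):
    # Probability that s_next is judged to come from the expert.
    return sigmoid(w * s_next + b)


def policy_objective(theta, s, w, b):
    # The policy tries to make the discriminator label its data as expert.
    s_next = forward_model(s, policy(theta, s))
    return math.log(discriminator(w, b, s_next))


def policy_gradient_via_model(theta, s, w, b):
    # Chain rule: d log D / d theta
    #   = (d log D / d s_next) * (d s_next / d a) * (d a / d theta).
    s_next = forward_model(s, policy(theta, s))
    d = discriminator(w, b, s_next)
    dlogd_dsnext = (1.0 - d) * w  # derivative of log(sigmoid(w*x + b))
    dsnext_da = 1.0               # forward model is s + a
    da_dtheta = s                 # policy is theta * s
    return dlogd_dsnext * dsnext_da * da_dtheta


def finite_difference(theta, s, w, b, eps=1e-6):
    # Central-difference check of the analytic gradient.
    hi = policy_objective(theta + eps, s, w, b)
    lo = policy_objective(theta - eps, s, w, b)
    return (hi - lo) / (2.0 * eps)


def ascent_step(theta, s, w, b, lr=0.1):
    # One gradient-ascent update on the policy parameter.
    return theta + lr * policy_gradient_via_model(theta, s, w, b)


if __name__ == "__main__":
    theta, s, w, b = 0.5, 2.0, 1.5, -0.3
    print("analytic gradient:", policy_gradient_via_model(theta, s, w, b))
    print("finite difference:", finite_difference(theta, s, w, b))
    print("objective before :", policy_objective(theta, s, w, b))
    print("objective after  :", policy_objective(ascent_step(theta, s, w, b), s, w, b))
```

In a model-free setup the environment transition is a black box, so the middle factor of the chain rule is unavailable and the gradient has to be estimated from sampled returns instead, which is the high-variance route the abstract refers to.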

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Reinforcement Learning in Robotics · Model Reduction and Neural Networks