Model-based Adversarial Imitation Learning
Nir Baram, Oron Anschel, Shie Mannor

TL;DR
This paper introduces MAIL, a model-based adversarial imitation learning algorithm. A forward model makes training differentiable end-to-end, which reduces both the number of environment interactions and the number of hyper-parameters to tune. The authors report results that outperform the state of the art on MuJoCo tasks.
Contribution
MAIL is the first to incorporate a forward model into adversarial imitation learning, enabling fully differentiable training and improved efficiency.
Findings
Outperforms current state-of-the-art in MuJoCo tasks
Requires fewer environment interactions
Needs fewer hyper-parameters to tune
Abstract
Generative adversarial learning is a popular new approach to training generative models which has been proven successful for other related problems as well. The general idea is to maintain an oracle that discriminates between the expert's data distribution and that of the generative model. The generative model is trained to capture the expert's distribution by maximizing the probability of misclassifying the data it generates. Overall, the system is differentiable end-to-end and is trained using basic backpropagation. This type of learning was successfully applied to the problem of policy imitation in a model-free setup. However, a model-free approach does not allow the system to be differentiable, which requires the use of high-variance gradient estimates. In this paper we introduce the Model-based Adversarial Imitation Learning (MAIL) algorithm. A model-based…
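The abstract's point about high-variance gradient estimates can be illustrated with a toy sketch. This is not the MAIL algorithm. It assumes a one-dimensional Gaussian policy and a hypothetical differentiable reward, `-(a - target)**2`, standing in for a discriminator signal. The function name `gradient_estimates` and all parameters are made up for illustration. The sketch compares a score-function (likelihood-ratio) estimator, which needs no differentiable path from the action to the reward, with a pathwise estimator that differentiates through the reward directly.

```python
import numpy as np


def gradient_estimates(mu=0.0, sigma=1.0, target=1.0, n=100_000, seed=0):
    """Per-sample estimates of d/dmu E[r(a)] for a ~ N(mu, sigma^2).

    r(a) = -(a - target)^2 is a hypothetical differentiable reward.
    The exact gradient is -2 * (mu - target).
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    a = mu + sigma * eps              # sampled actions
    reward = -(a - target) ** 2

    # Score-function estimator: r(a) * d/dmu log N(a; mu, sigma^2).
    # Usable when the reward is a black box (the model-free case).
    score_fn = reward * (a - mu) / sigma**2

    # Pathwise estimator: differentiate r(mu + sigma * eps) w.r.t. mu.
    # Requires a differentiable path from the parameter to the reward.
    pathwise = -2.0 * (a - target)
    return score_fn, pathwise


if __name__ == "__main__":
    score_fn, pathwise = gradient_estimates()
    print("exact gradient:", 2.0)
    print("score-function mean/var:", score_fn.mean(), score_fn.var())
    print("pathwise mean/var:      ", pathwise.mean(), pathwise.var())
```

Both estimators are unbiased for the same gradient in this toy setting, but the pathwise one has much lower per-sample variance. That gap is the general motivation for making the pipeline differentiable. How MAIL itself constructs and uses its forward model is described in the paper, not here.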
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Reinforcement Learning in Robotics · Model Reduction and Neural Networks
