A Divergence Minimization Perspective on Imitation Learning Methods
Seyed Kamyar Seyed Ghasemipour, Richard Zemel, Shixiang Gu

TL;DR
This paper offers a unified divergence minimization framework for understanding and comparing imitation learning methods, highlighting IRL's advantages in limited data scenarios and demonstrating diverse behavior learning without explicit rewards.
Contribution
It introduces $f$-MAX, a generalized divergence-based approach that unifies IRL methods, and provides new insights into the core differences between IRL and behavioral cloning.
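As a rough illustration of the quantity that a divergence-based view is built on, the sketch below evaluates the standard discrete $f$-divergence, $D_f(P \| Q) = \sum_x q(x)\, f(p(x)/q(x))$, for a few common generators $f$. It is not the paper's algorithm. The function names and the two toy distributions (`expert`, `policy`) are hypothetical and chosen only for illustration.

```python
import math

def f_divergence(p, q, f):
    """Discrete f-divergence D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)).

    Assumes p and q are probability vectors of equal length with q(x) > 0.
    """
    return sum(qx * f(px / qx) for px, qx in zip(p, q))

def forward_kl(u):
    # f(u) = u log u yields KL(P || Q); uses the convention 0 log 0 = 0.
    return u * math.log(u) if u > 0 else 0.0

def reverse_kl(u):
    # f(u) = -log u yields KL(Q || P); requires u > 0.
    return -math.log(u)

def jensen_shannon(u):
    # f(u) = 0.5 * (u log u - (u + 1) log((u + 1) / 2)) yields JS(P, Q).
    u_log_u = u * math.log(u) if u > 0 else 0.0
    return 0.5 * (u_log_u - (u + 1) * math.log((u + 1) / 2))

# Toy distributions over three state-action pairs (illustrative values only).
expert = [0.7, 0.2, 0.1]
policy = [0.4, 0.4, 0.2]

for name, f in [("forward KL", forward_kl),
                ("reverse KL", reverse_kl),
                ("Jensen-Shannon", jensen_shannon)]:
    print(f"{name}: {f_divergence(expert, policy, f):.4f}")
```

Swapping the generator `f` changes which divergence is measured while the surrounding computation stays the same, which is the sense in which a single $f$-divergence objective can cover a family of methods.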
Findings
IRL's state-marginal matching is key to its success
IRL outperforms BC in limited demonstration scenarios
Diverse behaviors can be learned without explicit rewards
Abstract
In many settings, it is desirable to learn decision-making and control policies through learning or bootstrapping from expert demonstrations. The most common approaches under this Imitation Learning (IL) framework are Behavioural Cloning (BC), and Inverse Reinforcement Learning (IRL). Recent methods for IRL have demonstrated the capacity to learn effective policies with access to a very limited set of demonstrations, a scenario in which BC methods often fail. Unfortunately, due to multiple factors of variation, directly comparing these methods does not provide adequate intuition for understanding this difference in performance. In this work, we present a unified probabilistic perspective on IL algorithms based on divergence minimization. We present $f$-MAX, an $f$-divergence generalization of AIRL [Fu et al., 2018], a state-of-the-art IRL method. $f$-MAX enables us to relate prior IRL…
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Robot Manipulation and Learning
