A Divergence Minimization Perspective on Imitation Learning Methods

Seyed Kamyar Seyed Ghasemipour; Richard Zemel; Shixiang Gu

arXiv:1911.02256·cs.LG·November 7, 2019·24 cites

TL;DR

This paper offers a unified divergence minimization framework for understanding and comparing imitation learning methods, highlighting IRL's advantages in limited data scenarios and demonstrating diverse behavior learning without explicit rewards.

Contribution

It introduces $f$-MAX, a generalized divergence-based approach that unifies IRL methods, and provides new insights into the core differences between IRL and behavioral cloning.
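As a rough illustration of the quantity being generalized, the sketch below computes a discrete $f$-divergence using the standard definition $D_f(P \| Q) = \sum_x q(x)\, f(p(x)/q(x))$. The toy distributions `expert` and `policy` and all function names are hypothetical and are not taken from the paper or its repositories. It is only meant to show how different choices of $f$ recover different divergences, not to reproduce $f$-MAX itself.

```python
import math

def f_divergence(p, q, f):
    """Discrete f-divergence D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)).

    Assumes p and q are lists of probabilities over the same support
    with q(x) > 0 everywhere (an assumption made for this sketch).
    """
    return sum(qx * f(px / qx) for px, qx in zip(p, q))

def f_forward_kl(u):
    # f(u) = u log u yields KL(P || Q)
    return u * math.log(u) if u > 0 else 0.0

def f_reverse_kl(u):
    # f(u) = -log u yields KL(Q || P); requires u > 0
    return -math.log(u)

# Hypothetical toy distributions standing in for expert and policy
# state-action occupancy measures.
expert = [0.7, 0.2, 0.1]
policy = [0.4, 0.4, 0.2]

forward_kl = f_divergence(expert, policy, f_forward_kl)  # KL(expert || policy)
reverse_kl = f_divergence(expert, policy, f_reverse_kl)  # KL(policy || expert)
```

Swapping `f` changes which divergence is measured while the rest of the computation stays fixed, which is the sense in which an $f$-divergence formulation can cover several objectives at once.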

Findings

01 IRL's state-marginal matching is key to its success

02 IRL outperforms BC in limited demonstration scenarios

03 Diverse behaviors can be learned without explicit rewards

Abstract

In many settings, it is desirable to learn decision-making and control policies through learning or bootstrapping from expert demonstrations. The most common approaches under this Imitation Learning (IL) framework are Behavioural Cloning (BC) and Inverse Reinforcement Learning (IRL). Recent methods for IRL have demonstrated the capacity to learn effective policies with access to a very limited set of demonstrations, a scenario in which BC methods often fail. Unfortunately, due to multiple factors of variation, directly comparing these methods does not provide adequate intuition for understanding this difference in performance. In this work, we present a unified probabilistic perspective on IL algorithms based on divergence minimization. We present $f$-MAX, an $f$-divergence generalization of AIRL [Fu et al., 2018], a state-of-the-art IRL method. $f$-MAX enables us to relate prior IRL…

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Robot Manipulation and Learning