Primal Wasserstein Imitation Learning

Robert Dadashi; Léonard Hussenot; Matthieu Geist; Olivier Pietquin

arXiv:2006.04678 · cs.LG · March 18, 2021 · 41 cites

TL;DR

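This paper introduces Primal Wasserstein Imitation Learning (PWIL), an imitation learning (IL) method whose reward function is derived offline from the primal form of the Wasserstein distance between the expert and agent state-action distributions. The reward requires little fine-tuning, and the method recovers expert behavior on MuJoCo continuous control tasks in a sample-efficient manner.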

Contribution

PWIL derives its reward function offline from the primal form of the Wasserstein distance between the expert and agent state-action distributions. Recent adversarial IL algorithms, by contrast, learn a reward function through interactions with the environment.
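To make the idea of an offline, Wasserstein-based reward concrete, here is a minimal sketch, not the authors' implementation. It assumes equal-weight empirical distributions over state-action pairs, a Euclidean ground distance, a greedy coupling that matches each agent pair to its nearest expert pairs with remaining mass, and an exponential mapping from cost to reward. The names `greedy_coupling_costs`, `costs_to_rewards`, and `scale` are hypothetical. Because any valid coupling upper-bounds the infimum in the primal form, the summed costs upper-bound the Wasserstein-1 distance.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length tuples (assumed ground metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def greedy_coupling_costs(agent_pairs, expert_pairs, dist=euclidean):
    """Per-step transport costs under a greedy coupling (illustrative sketch).

    Each agent pair carries mass 1/len(agent_pairs); each expert pair starts
    with mass 1/len(expert_pairs). Agent pairs are processed in order, and
    each one ships its mass to the nearest expert pairs that still have mass.
    """
    n_agent, n_expert = len(agent_pairs), len(expert_pairs)
    expert_mass = [1.0 / n_expert] * n_expert
    costs = []
    for pair in agent_pairs:
        remaining = 1.0 / n_agent
        cost = 0.0
        # Visit expert pairs from nearest to farthest.
        order = sorted(range(n_expert), key=lambda j: dist(pair, expert_pairs[j]))
        for j in order:
            if remaining <= 1e-12:
                break
            moved = min(remaining, expert_mass[j])
            if moved <= 0.0:
                continue
            cost += moved * dist(pair, expert_pairs[j])
            expert_mass[j] -= moved
            remaining -= moved
        costs.append(cost)
    return costs


def costs_to_rewards(costs, scale=1.0):
    """Map costs to rewards in (0, 1]; the exponential form is an assumption here."""
    return [math.exp(-scale * c) for c in costs]


if __name__ == "__main__":
    expert = [(0.0,), (2.0,)]
    agent = [(0.0,), (0.0,)]
    step_costs = greedy_coupling_costs(agent, expert)
    print(step_costs, costs_to_rewards(step_costs))
```

Nothing in this computation requires training a discriminator or querying the environment: the rewards depend only on the expert demonstrations and the agent's own visited pairs, which is the contrast with adversarial methods drawn above.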

Findings

01

Successfully recovers expert behavior on MuJoCo tasks

02

Is sample efficient in terms of both agent interactions and expert interactions with the environment

03

The trained agent's behavior matches the expert's as measured by the Wasserstein distance

Abstract

Imitation Learning (IL) methods seek to match the behavior of an agent with that of an expert. In the present work, we propose a new IL method based on a conceptually simple algorithm: Primal Wasserstein Imitation Learning (PWIL), which ties to the primal form of the Wasserstein distance between the expert and the agent state-action distributions. We present a reward function which is derived offline, as opposed to recent adversarial IL algorithms that learn a reward function through interactions with the environment, and which requires little fine-tuning. We show that we can recover expert behavior on a variety of continuous control tasks of the MuJoCo domain in a sample efficient manner in terms of agent interactions and of expert interactions with the environment. Finally, we show that the behavior of the agent we train matches the behavior of the expert with the Wasserstein…
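For reference, the standard primal (Kantorovich) form of the Wasserstein distance of order $p$ between two distributions $\mu$ and $\nu$ on a metric space $(\mathcal{M}, d)$ is given below. The notation is generic and may differ from the paper's; in the IL setting described above, $\mu$ and $\nu$ would be the agent and expert state-action distributions.

```latex
W_p^p(\mu, \nu) = \inf_{\theta \in \Theta(\mu, \nu)} \int_{\mathcal{M} \times \mathcal{M}} d(x, y)^p \, \mathrm{d}\theta(x, y)
```

Here $\Theta(\mu, \nu)$ denotes the set of couplings of $\mu$ and $\nu$, that is, joint distributions whose marginals are $\mu$ and $\nu$. Any particular coupling therefore yields an upper bound on $W_p^p(\mu, \nu)$.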

Code & Models

Videos

Primal Wasserstein Imitation Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications

Methods: Primal Wasserstein Imitation Learning