Imitation Learning from Observation through Optimal Transport

Wei-Di Chang; Scott Fujimoto; David Meger; Gregory Dudek

arXiv:2310.01632·cs.RO·October 7, 2024·1 cites

TL;DR

This paper introduces a simplified optimal transport-based method for Imitation Learning from Observation that effectively imitates expert behavior using only observational data, without learned models or adversarial training.

Contribution

It presents a novel, model-free approach leveraging Wasserstein distance for ILfO, compatible with any RL algorithm, and demonstrates superior performance on continuous control tasks.

Findings

01 Achieves expert-level performance with single-trajectory observations

02 Outperforms existing ILfO methods in various tasks

03 Simplifies reward generation without adversarial training

Abstract

Imitation Learning from Observation (ILfO) is a setting in which a learner tries to imitate the behavior of an expert, using only observational data and without the direct guidance of demonstrated actions. In this paper, we re-examine optimal transport for IL, in which a reward is generated based on the Wasserstein distance between the state trajectories of the learner and expert. We show that existing methods can be simplified to generate a reward function without requiring learned models or adversarial learning. Unlike many other state-of-the-art methods, our approach can be integrated with any RL algorithm and is amenable to ILfO. We demonstrate the effectiveness of this simple approach on a variety of continuous control tasks and find that it surpasses the state of the art in the ILfO setting, achieving expert-level performance across a range of evaluation domains even when…
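To make the idea of a transport-based reward concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the exact reward formula, so this example assumes a Euclidean cost between states, uniform weights on every timestep, and an entropy-regularised (Sinkhorn) approximation of the transport plan. The names `pairwise_cost`, `sinkhorn_plan`, and `ot_rewards`, along with the `reg` and `n_iters` parameters, are hypothetical and chosen only for illustration.

```python
import math


def pairwise_cost(learner_states, expert_states):
    """Euclidean distance between every learner state and every expert state."""
    return [[math.dist(s, e) for e in expert_states] for s in learner_states]


def sinkhorn_plan(cost, reg=0.05, n_iters=200):
    """Approximate transport plan with uniform weights (assumed, not from the paper)."""
    n, m = len(cost), len(cost[0])
    scale = max(max(row) for row in cost) or 1.0  # normalise costs so exp() stays stable
    kernel = [[math.exp(-c / (reg * scale)) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the uniform marginals.
        u = [(1.0 / n) / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def ot_rewards(learner_states, expert_states):
    """Per-timestep reward: negative transport cost attributed to each learner state."""
    cost = pairwise_cost(learner_states, expert_states)
    plan = sinkhorn_plan(cost)
    return [-sum(p * c for p, c in zip(plan[t], cost[t])) for t in range(len(cost))]


# Toy 2-D state trajectories (made up for illustration; no actions are used).
expert = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
near = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
far = [(0.0, 5.0), (1.0, 5.0), (2.0, 5.0)]

rewards_near = ot_rewards(near, expert)
rewards_far = ot_rewards(far, expert)
print(sum(rewards_near) > sum(rewards_far))
```

Because the rewards depend only on states and are computed once the learner's trajectory is available, they could in principle be written into the replay data of any RL algorithm, which matches the abstract's claim of compatibility. The summed rewards equal the negative of the approximate transport cost, so a trajectory that tracks the expert more closely receives a higher return.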

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition