Wasserstein Distance guided Adversarial Imitation Learning with Reward Shape Exploration

Ming Zhang; Yawei Wang; Xiaoteng Ma; Li Xia; Jun Yang; Zhiheng Li; Xiu Li

arXiv:2006.03503·cs.LG·December 9, 2020

TL;DR

This paper introduces WDAIL, a novel adversarial imitation learning algorithm that uses Wasserstein distance and reward shape exploration to improve stability and performance in complex continuous control tasks.

Contribution

The paper proposes a new IL method combining Wasserstein distance, PPO, and reward shape exploration, addressing limitations of JS divergence-based rewards in GAIL.
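The limitation of JS divergence mentioned above can be illustrated with a toy example that is not taken from the paper: when two distributions have disjoint support, the JS divergence stays at log 2 however far apart they are, whereas the Wasserstein-1 distance keeps growing with the gap. The function names, the five-point support, and the point-mass "expert" and "policy" distributions below are assumptions made purely for illustration.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) of two discrete distributions on a shared support."""
    def kl(a, b):
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def wasserstein_1d(p, q, positions):
    """Wasserstein-1 distance on sorted 1-D positions, computed from the gap between the two CDFs."""
    total, cdf_gap = 0.0, 0.0
    for i in range(len(positions) - 1):
        cdf_gap += p[i] - q[i]
        total += abs(cdf_gap) * (positions[i + 1] - positions[i])
    return total

positions = [0.0, 1.0, 2.0, 3.0, 4.0]
expert = [1.0, 0.0, 0.0, 0.0, 0.0]  # toy "expert": all mass at position 0

for shift in (1, 2, 4):
    policy = [0.0] * len(positions)
    policy[shift] = 1.0  # toy "policy": all mass at position `shift`
    print(shift, js_divergence(expert, policy), wasserstein_1d(expert, policy, positions))
```

In this toy setting the JS column is constant, so it gives no signal about whether the policy is moving toward the expert, while the Wasserstein column shrinks as the shift shrinks.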

Findings

01

WDAIL achieves stable learning in complex tasks.

02

The method outperforms existing GAIL variants in MuJoCo environments.

03

Reward shape exploration enhances task-specific performance.

Abstract

Generative adversarial imitation learning (GAIL) provides an adversarial learning framework for imitating an expert policy from demonstrations in high-dimensional continuous tasks. However, GAIL and almost all of its extensions use a single reward function of logarithmic form, paired with the Jensen-Shannon (JS) divergence in the adversarial training strategy, for every complex environment. A fixed logarithmic reward function may not be able to solve all complex tasks, and the vanishing gradients problem caused by the JS divergence harms the adversarial learning process. In this paper, we propose a new algorithm named Wasserstein Distance guided Adversarial Imitation Learning (WDAIL) to improve the performance of imitation learning (IL). There are three improvements in our method: (a) introducing the Wasserstein distance to obtain a more appropriate measure in the…
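As a rough sketch of what "reward shape" can mean here, the snippet below contrasts a GAIL-style logarithmic reward with a few alternative shapes applied to an unbounded critic score. The shape names (`linear`, `tanh`, `clipped`), the helper functions, and the sign convention for the discriminator are all hypothetical choices for illustration; the truncated abstract does not say which shapes the authors actually explore.

```python
import math

def log_reward(d_prob, eps=1e-8):
    """GAIL-style logarithmic reward.

    Assumption: d_prob is the discriminator's probability that (s, a) is expert data.
    """
    return -math.log(max(1.0 - d_prob, eps))

# Hypothetical candidate shapes applied to an unbounded critic score f(s, a).
REWARD_SHAPES = {
    "linear": lambda f: f,
    "tanh": lambda f: math.tanh(f),
    "clipped": lambda f: max(-1.0, min(1.0, f)),
}

def shaped_reward(score, shape="linear"):
    """Map a critic score to a reward using one of the candidate shapes."""
    return REWARD_SHAPES[shape](score)

for score in (-3.0, 0.0, 3.0):
    print(score, {name: shaped_reward(score, name) for name in REWARD_SHAPES})
```

Treating the shape as a lookup keeps it a per-task setting that can be swapped without touching the critic or the policy optimizer.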

Code & Models

Repositories

mingzhangPHD/Adversarial-Imitation-Learning (PyTorch, official)

Taxonomy

Methods: Generative Adversarial Imitation Learning