Reparameterized Variational Divergence Minimization for Stable Imitation

Dilip Arumugam; Debadeepta Dey; Alekh Agarwal; Asli Celikyilmaz; Elnaz Nouri; Bill Dolan

arXiv:2006.10810·cs.LG·June 22, 2020·1 cites

TL;DR

This paper introduces a reparameterization technique for adversarial imitation learning that improves the stability and performance of imitation learning from observation, particularly on low-dimensional continuous-control tasks.

Contribution

The authors propose a reparameterization trick that stabilizes $f$-divergence minimization in adversarial imitation learning from observation, mitigating the numerical instabilities that arise when the divergence is minimized through reinforcement learning.

Findings

01. Reparameterization improves the stability of $f$-divergence minimization.

02. The method outperforms baseline approaches in experiments.

03. The method achieves closer imitation of expert behavior in low-dimensional continuous-control tasks.

Abstract

While recent state-of-the-art results for adversarial imitation-learning algorithms are encouraging, recent works exploring the imitation learning from observation (ILO) setting, where trajectories only contain expert observations, have not been met with the same success. Inspired by recent investigations of $f$-divergence manipulation for the standard imitation learning setting (Ke et al., 2019; Ghasemipour et al., 2019), we here examine the extent to which variations in the choice of probabilistic divergence may yield more performant ILO algorithms. We unfortunately find that $f$-divergence minimization through reinforcement learning is susceptible to numerical instabilities. We contribute a reparameterization trick for adversarial imitation learning to alleviate the optimization challenges of the promising $f$-divergence minimization framework. Empirically, we demonstrate…
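As background for the $f$-divergence framework the abstract refers to, the following is a minimal sketch of the standard variational lower bound $D_f(P \| Q) \ge \mathbb{E}_P[T(x)] - \mathbb{E}_Q[f^*(T(x))]$, worked out for the KL case $f(u) = u \log u$ with conjugate $f^*(t) = e^{t-1}$. It is not the authors' reparameterization or their training code. The distributions `p` and `q`, the function names, and the choice of KL are all illustrative assumptions.

```python
import math

def kl_divergence(p, q):
    """Exact KL(p || q) for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def variational_kl_bound(p, q, critic):
    """Variational lower bound E_p[T] - E_q[f*(T)] with f*(t) = exp(t - 1).

    `critic` holds one value T(x) per outcome; it plays the role that a
    discriminator network would play in an adversarial setup (assumption).
    """
    expect_p = sum(pi * ti for pi, ti in zip(p, critic))
    expect_q = sum(qi * math.exp(ti - 1.0) for qi, ti in zip(q, critic))
    return expect_p - expect_q

# Hypothetical toy distributions, chosen only for illustration.
p = [0.7, 0.2, 0.1]
q = [0.3, 0.3, 0.4]

# The bound is tight at T*(x) = f'(p(x) / q(x)) = 1 + log(p(x) / q(x)).
optimal_critic = [1.0 + math.log(pi / qi) for pi, qi in zip(p, q)]
zero_critic = [0.0, 0.0, 0.0]  # an arbitrary, suboptimal critic

exact = kl_divergence(p, q)
tight = variational_kl_bound(p, q, optimal_critic)
loose = variational_kl_bound(p, q, zero_critic)
```

With the optimal critic the bound recovers the exact divergence, while any other critic gives a smaller value. Note that the conjugate term grows exponentially in the critic output for this choice of $f$; whether that is the source of the instabilities the authors report is not stated in the abstract.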

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Model Reduction and Neural Networks