Reparameterized Variational Divergence Minimization for Stable Imitation
Dilip Arumugam, Debadeepta Dey, Alekh Agarwal, Asli Celikyilmaz, Elnaz Nouri, Bill Dolan

TL;DR
This paper introduces a reparameterization technique for adversarial imitation learning to improve stability and performance in imitation learning from observation, especially in low-dimensional continuous-control tasks.
Contribution
The authors propose a novel reparameterization trick that stabilizes $f$-divergence minimization in adversarial imitation learning from observation, leading to better performance.
Findings
Reparameterization improves stability of $f$-divergence minimization.
The method outperforms baseline approaches in experiments.
Achieves closer imitation of expert behavior in low-dimensional tasks.
Abstract
While recent state-of-the-art results for adversarial imitation-learning algorithms are encouraging, recent works exploring the imitation learning from observation (ILO) setting, where trajectories only contain expert observations, have not been met with the same success. Inspired by recent investigations of $f$-divergence manipulation for the standard imitation learning setting (Ke et al., 2019; Ghasemipour et al., 2019), we here examine the extent to which variations in the choice of probabilistic divergence may yield more performant ILO algorithms. We unfortunately find that $f$-divergence minimization through reinforcement learning is susceptible to numerical instabilities. We contribute a reparameterization trick for adversarial imitation learning to alleviate the optimization challenges of the promising $f$-divergence minimization framework. Empirically, we demonstrate…
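For readers unfamiliar with the variational view of $f$-divergences that the abstract refers to, the following is a minimal sketch of the standard variational lower bound, $D_f(P \| Q) \ge \mathbb{E}_P[T(x)] - \mathbb{E}_Q[f^*(T(x))]$, for the KL case $f(u) = u \log u$ with conjugate $f^*(t) = e^{t-1}$. It is not the paper's reparameterization or its algorithm. The function names, the discrete three-outcome distributions, and the critic values are all hypothetical and chosen only for illustration.

```python
import math

def kl_divergence(p, q):
    """Exact KL(P || Q) for discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def variational_kl_bound(p, q, critic):
    """Variational lower bound E_P[T] - E_Q[f*(T)] with f*(t) = exp(t - 1).

    `critic` holds one value T(x) per outcome (an illustrative stand-in for
    a learned discriminator).
    """
    expert_term = sum(pi * t for pi, t in zip(p, critic))
    policy_term = sum(qi * math.exp(t - 1.0) for qi, t in zip(q, critic))
    return expert_term - policy_term

# Hypothetical "expert" (p) and "imitator" (q) distributions over three outcomes.
p = [0.7, 0.2, 0.1]
q = [0.3, 0.3, 0.4]

# For KL the bound is tight at T*(x) = 1 + log(p(x) / q(x)).
optimal_critic = [1.0 + math.log(pi / qi) for pi, qi in zip(p, q)]

exact = kl_divergence(p, q)
tight = variational_kl_bound(p, q, optimal_critic)
loose = variational_kl_bound(p, q, [0.0, 0.0, 0.0])  # an uninformed critic
```

In this sketch `tight` matches `exact` up to floating-point error, while `loose` falls below it. The conjugate term grows exponentially in the critic output, which is one way to see how objectives of this form can become numerically delicate; this is an illustration only, not the paper's own analysis.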
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Model Reduction and Neural Networks
