Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow

Xue Bin Peng; Angjoo Kanazawa; Sam Toyer; Pieter Abbeel; Sergey Levine

arXiv:1810.00821·cs.LG·August 26, 2020·101 cites

TL;DR

This paper introduces a variational discriminator bottleneck (VDB) that constrains information flow in adversarial models, leading to improved stability and performance in imitation learning, inverse reinforcement learning, and GANs.

Contribution

The paper proposes an information bottleneck for discriminators that limits the mutual information between the observations and the discriminator's internal representation, keeping gradients informative and stabilizing adversarial training.

Findings

01 Improves imitation learning from raw video demonstrations.
02 Enables transfer and re-optimization of reward functions.
03 Enhances GAN training stability and image quality.

Abstract

Adversarial learning methods have been proposed for a wide range of applications, but the training of adversarial models can be notoriously unstable. Effectively balancing the performance of the generator and discriminator is critical, since a discriminator that achieves very high accuracy will produce relatively uninformative gradients. In this work, we propose a simple and general technique to constrain information flow in the discriminator by means of an information bottleneck. By enforcing a constraint on the mutual information between the observations and the discriminator's internal representation, we can effectively modulate the discriminator's accuracy and maintain useful and informative gradients. We demonstrate that our proposed variational discriminator bottleneck (VDB) leads to significant improvements across three distinct application areas for adversarial learning…
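
The abstract describes the bottleneck only in words, so the following is a minimal sketch of one way such a constraint could be written down, not the authors' implementation. It assumes a stochastic encoder that outputs a diagonal Gaussian over a latent code, a standard-normal prior, and a non-negative multiplier that is adjusted by dual gradient ascent. The names `kl_to_standard_normal`, `bce_with_logits`, `bottleneck_loss`, `update_beta`, `i_c`, `beta`, and `step_size` are all hypothetical and chosen for illustration.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """Per-sample KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims.

    This KL term is a standard variational upper bound on the mutual
    information between inputs and the latent code (assumed setup).
    """
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=1)


def bce_with_logits(logits, labels):
    """Numerically stable binary cross-entropy on raw discriminator logits."""
    return float(np.mean(
        np.maximum(logits, 0.0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))
    ))


def bottleneck_loss(logits, labels, mu, log_var, beta, i_c):
    """Classification loss plus a penalty on exceeding the information budget i_c."""
    mean_kl = float(np.mean(kl_to_standard_normal(mu, log_var)))
    loss = bce_with_logits(logits, labels) + beta * (mean_kl - i_c)
    return loss, mean_kl


def update_beta(beta, mean_kl, i_c, step_size):
    """Dual ascent step: raise beta when the KL exceeds i_c, lower it otherwise."""
    return max(0.0, beta + step_size * (mean_kl - i_c))


# Toy usage with random stand-ins for encoder and discriminator outputs.
rng = np.random.default_rng(0)
mu = rng.normal(size=(8, 4))           # encoder means (hypothetical)
log_var = rng.normal(size=(8, 4))      # encoder log-variances (hypothetical)
logits = rng.normal(size=8)            # discriminator logits on sampled codes
labels = np.array([1.0] * 4 + [0.0] * 4)  # 1 = real/expert, 0 = generated

beta = 0.1
loss, mean_kl = bottleneck_loss(logits, labels, mu, log_var, beta, i_c=0.5)
beta = update_beta(beta, mean_kl, i_c=0.5, step_size=1e-3)
```

In this sketch a smaller `i_c` forces the encoder to discard more information about the input, which lowers the discriminator's achievable accuracy; the adaptive multiplier avoids having to hand-tune a fixed penalty weight.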

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques