InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets

Xi Chen; Yan Duan; Rein Houthooft; John Schulman; Ilya Sutskever; Pieter Abbeel

arXiv:1606.03657·cs.LG·June 14, 2016·1.2k cites

Open Access · 5 Repos

TL;DR

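InfoGAN is an unsupervised method for learning disentangled, interpretable representations in generative models. It maximizes the mutual information between a small subset of the latent variables and the generated observation, and is demonstrated on MNIST, SVHN, CelebA, and 3D rendered images.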

Contribution

InfoGAN extends GANs with an information-theoretic objective that learns interpretable features without supervision, and derives a lower bound on the mutual information term that can be optimized efficiently.

Findings

01 Disentangles writing style from digit shape on MNIST, pose from lighting in 3D rendered images, and background digits from the central digit on SVHN

02 Learns interpretable features comparable to supervised methods

03 Applies to diverse image datasets, including MNIST, SVHN, and CelebA

Abstract

This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner. InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation. We derive a lower bound to the mutual information objective that can be optimized efficiently, and show that our training procedure can be interpreted as a variation of the Wake-Sleep algorithm. Specifically, InfoGAN successfully disentangles writing styles from digit shapes on the MNIST dataset, pose from lighting of 3D rendered images, and background digits from the central digit on the SVHN dataset. It also discovers visual concepts that include hair styles, presence/absence of eyeglasses, and emotions on the CelebA face dataset. Experiments…
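
The lower bound mentioned in the abstract can be illustrated with a small numerical sketch. It assumes a single categorical latent code c with a fixed prior p(c) and an auxiliary distribution Q(c|x) that tries to recover the code from a generated sample x = G(z, c); under those assumptions the bound is usually written as L_I = E[log Q(c|x)] + H(c). The function names, the 10-way uniform prior, and the hand-written Q outputs below are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def categorical_entropy(prior):
    """Entropy H(c) of a categorical prior, in nats."""
    return -sum(p * math.log(p) for p in prior if p > 0)


def mi_lower_bound(q_probs, sampled_codes, prior):
    """Monte Carlo estimate of L_I = E[log Q(c|x)] + H(c).

    q_probs:       one probability vector Q(. | x_i) per generated sample
    sampled_codes: the code index c_i that was fed to the generator for x_i
    prior:         the fixed categorical prior p(c)
    """
    log_likelihood = sum(
        math.log(q[c]) for q, c in zip(q_probs, sampled_codes)
    ) / len(sampled_codes)
    return log_likelihood + categorical_entropy(prior)


# Hypothetical setup: a 10-way uniform code (an assumption, not from the page).
prior = [0.1] * 10
codes = [3, 7]

# Q recovers the code perfectly: the bound reaches H(c) = log 10.
perfect_q = [[1.0 if k == c else 0.0 for k in range(10)] for c in codes]

# Q ignores the sample and returns the prior: the bound collapses to 0.
uninformed_q = [prior, prior]

bound_perfect = mi_lower_bound(perfect_q, codes, prior)
bound_uninformed = mi_lower_bound(uninformed_q, codes, prior)
```

In this toy case the bound ranges from about 0 nats, when Q learns nothing about the code, up to log 10 (about 2.30 nats), when the code is fully recoverable from the sample. In training, this term would be maximized with respect to both the generator and Q, alongside the usual GAN objective.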

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Human Pose and Action Recognition

Methods: Dense Connections · Softmax · Feedforward Network · Sigmoid Activation · Tanh Activation · Adam · Batch Normalization · Convolution