Deep Convolutional Inverse Graphics Network
Tejas D. Kulkarni, Will Whitney, Pushmeet Kohli, Joshua B. Tenenbaum

TL;DR
The paper introduces DC-IGN, a deep convolutional network that learns disentangled, interpretable representations of images, enabling controlled generation of images with variations in pose and lighting.
Contribution
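It proposes a training procedure that encourages individual neurons in the graphics code layer to represent a single transformation (e.g. pose or lighting), within a convolution and de-convolution network trained with Stochastic Gradient Variational Bayes (SGVB).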
Findings
Successfully learns interpretable representations of pose and lighting.
Can generate images with controlled variations in pose and lighting.
Presents qualitative and quantitative evidence that the model learns to act as a 3D rendering engine.
Abstract
This paper presents the Deep Convolution Inverse Graphics Network (DC-IGN), a model that learns an interpretable representation of images. This representation is disentangled with respect to transformations such as out-of-plane rotations and lighting variations. The DC-IGN model is composed of multiple layers of convolution and de-convolution operators and is trained using the Stochastic Gradient Variational Bayes (SGVB) algorithm. We propose a training procedure to encourage neurons in the graphics code layer to represent a specific transformation (e.g. pose or light). Given a single input image, our model can generate new images of the same object with variations in pose and lighting. We present qualitative and quantitative results of the model's efficacy at learning a 3D rendering engine.
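The abstract names SGVB as the training algorithm but does not spell out its mechanics. As a rough illustration only, the sketch below shows the two ingredients SGVB is usually built from: the reparameterized sample and the closed-form KL term. It assumes a diagonal Gaussian posterior over the graphics code and a standard normal prior, and it is not the authors' implementation. The names `reparameterize`, `kl_to_standard_normal`, `mu`, and `log_var` are hypothetical and do not come from the paper.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise.

    Writing the sample this way keeps z a deterministic function of
    (mu, log_var) given eps, which is what lets gradients flow through it.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


# Hypothetical encoder outputs for a 3-dimensional code (assumed values).
rng = random.Random(0)
mu = [0.5, -1.0, 0.0]
log_var = [0.0, -2.0, 0.0]

z = reparameterize(mu, log_var, rng)      # stochastic code fed to the decoder
kl = kl_to_standard_normal(mu, log_var)   # regularizer added to the reconstruction loss
```

In a full model, `mu` and `log_var` would come from the convolutional encoder and `z` would be passed to the de-convolutional decoder; the training loss would combine a reconstruction term with `kl`.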
Taxonomy
Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques
Methods: Stochastic Gradient Variational Bayes · Convolution
