Deep disentangled representations for volumetric reconstruction
Edward Grant, Pushmeet Kohli, Marcel van Gerven

TL;DR
This paper presents a convolutional neural network that learns to produce disentangled 3D object representations from 2D images, enabling volumetric reconstruction and separate modeling of object identity, lighting, and pose.
Contribution
A neural network architecture that pairs a single encoder with a twin-tailed decoder and learns disentangled graphical descriptions and volumetric reconstructions from images.
Findings
The network generates volumes and disentangled graphical descriptions from images and videos.
The graphics code represents the 3D object separately from its lighting and pose conditions.
The method is demonstrated on images and videos of faces and chairs.
Abstract
We introduce a convolutional neural network for inferring a compact disentangled graphical description of objects from 2D images that can be used for volumetric reconstruction. The network comprises an encoder and a twin-tailed decoder. The encoder generates a disentangled graphics code. The first decoder generates a volume, and the second decoder reconstructs the input image using a novel training regime that allows the graphics code to learn a separate representation of the 3D object and a description of its lighting and pose conditions. We demonstrate this method by generating volumes and disentangled graphical descriptions from images and videos of faces and chairs.
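The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: random linear maps stand in for the trained convolutional layers, and all names and sizes (`SHAPE_DIM`, `SCENE_DIM`, the 16x16 image, the 8x8x8 voxel grid) are hypothetical. It also assumes that the volume decoder reads only the object part of the graphics code, while the image decoder reads the whole code; the abstract does not spell out this wiring.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, chosen only for illustration (assumptions, not from the paper).
IMG_SIDE, VOL_SIDE = 16, 8
SHAPE_DIM = 8   # code units describing the 3D object
SCENE_DIM = 4   # code units describing lighting and pose

def linear(n_in, n_out):
    """Random weight matrix standing in for a trained layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

W_enc = linear(IMG_SIDE ** 2, SHAPE_DIM + SCENE_DIM)
W_vol = linear(SHAPE_DIM, VOL_SIDE ** 3)
W_img = linear(SHAPE_DIM + SCENE_DIM, IMG_SIDE ** 2)

def encode(image):
    """Map an image to a graphics code split into object and scene parts."""
    code = np.tanh(image.reshape(-1) @ W_enc)
    return code[:SHAPE_DIM], code[SHAPE_DIM:]

def decode_volume(shape_code):
    """First decoder tail: voxel occupancies from the object part only (assumed)."""
    return sigmoid(shape_code @ W_vol).reshape(VOL_SIDE, VOL_SIDE, VOL_SIDE)

def decode_image(shape_code, scene_code):
    """Second decoder tail: image reconstruction from the full graphics code."""
    full_code = np.concatenate([shape_code, scene_code])
    return sigmoid(full_code @ W_img).reshape(IMG_SIDE, IMG_SIDE)

image = rng.random((IMG_SIDE, IMG_SIDE))
shape_code, scene_code = encode(image)
volume = decode_volume(shape_code)
reconstruction = decode_image(shape_code, scene_code)

# Swapping in a different lighting/pose code changes the rendered image,
# but the volume is untouched because it never sees that part of the code.
other_scene = np.tanh(rng.normal(size=SCENE_DIM))
relit = decode_image(shape_code, other_scene)
```

Under these assumptions the split is structural: the volume depends only on `shape_code`, so any change to `scene_code` affects the reconstructed image and nothing else. In the paper the separation is instead learned through the training regime, which this sketch does not model.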
Taxonomy
Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques
