Visual Object Networks: Image Generation with Disentangled 3D Representation
Jun-Yan Zhu, Zhoutong Zhang, Chengkai Zhang, Jiajun Wu, Antonio Torralba, Joshua B. Tenenbaum, William T. Freeman

TL;DR
The paper introduces Visual Object Networks (VON), a generative model that synthesizes realistic images with a disentangled 3D representation, enabling 3D-aware image editing and manipulation.
Contribution
It presents a novel end-to-end adversarial framework that models shape, viewpoint, and texture separately, improving realism and enabling 3D operations in image synthesis.
Findings
VON generates more realistic images than state-of-the-art 2D image synthesis methods.
It allows changing viewpoints and editing shape and texture.
It enables transfer of appearance across objects and viewpoints.
Abstract
Recent progress in deep generative models has led to tremendous breakthroughs in image generation. However, while existing models can synthesize photorealistic images, they lack an understanding of our underlying 3D world. We present a new generative model, Visual Object Networks (VON), synthesizing natural images of objects with a disentangled 3D representation. Inspired by classic graphics rendering pipelines, we unravel our image formation process into three conditionally independent factors---shape, viewpoint, and texture---and present an end-to-end adversarial learning framework that jointly models 3D shapes and 2D images. Our model first learns to synthesize 3D shapes that are indistinguishable from real shapes. It then renders the object's 2.5D sketches (i.e., silhouette and depth map) from its shape under a sampled viewpoint. Finally, it learns to add realistic texture to these…
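The middle stage of the pipeline described in the abstract, rendering a silhouette and depth map from a 3D shape under a chosen viewpoint, can be illustrated with a small sketch. This is only an illustration under simplifying assumptions, not the paper's projection module: it uses a binary voxel grid, an orthographic camera looking along the z axis, and a quarter-turn rotation as a stand-in for a sampled viewpoint. The names `render_sketch` and `rotate_quarter_turn` are hypothetical.

```python
from typing import List, Tuple

# Binary occupancy grid indexed as voxels[z][y][x]; z grows away from the camera.
Voxels = List[List[List[int]]]


def render_sketch(voxels: Voxels) -> Tuple[List[List[int]], List[List[float]]]:
    """Orthographic projection along z (an assumption made for simplicity).

    Returns (silhouette, depth). Depth is the index of the first occupied
    voxel along each ray, or infinity where the ray hits nothing.
    """
    depth_n, height, width = len(voxels), len(voxels[0]), len(voxels[0][0])
    silhouette = [[0] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for z in range(depth_n):  # march each ray front to back
                if voxels[z][y][x]:
                    silhouette[y][x] = 1
                    depth[y][x] = float(z)
                    break
    return silhouette, depth


def rotate_quarter_turn(voxels: Voxels) -> Voxels:
    """Quarter turn about the vertical (y) axis: a toy viewpoint change."""
    depth_n, height, width = len(voxels), len(voxels[0]), len(voxels[0][0])
    return [
        [[voxels[x2][y][width - 1 - z2] for x2 in range(depth_n)]
         for y in range(height)]
        for z2 in range(width)
    ]


# A tiny L-shaped object: 2 slices deep, 1 row high, 2 columns wide.
shape = [[[1, 0]], [[1, 1]]]
front_silhouette, front_depth = render_sketch(shape)
side_silhouette, side_depth = render_sketch(rotate_quarter_turn(shape))
```

The same shape yields different depth maps from the two viewpoints while the shape itself is untouched, which is the sense in which shape and viewpoint are separate factors. A texture stage would then take the silhouette and depth map as input; that learned step is not sketched here.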
Taxonomy
Topics: 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques · Advanced Vision and Imaging
