TL;DR
GIRAFFE represents scenes as compositional 3D generative neural feature fields. This enables controllable, realistic image synthesis in which objects, their shapes, and their appearances are disentangled, learned from unstructured image collections.
Contribution
The paper incorporates a compositional 3D scene representation into a generative model. The representation disentangles one or more objects from the background, which makes image synthesis more controllable without additional supervision.
Findings
Disentangles objects from backgrounds in scene synthesis.
Enables translating and rotating individual objects and changing the camera pose.
Learns from unstructured, unposed image collections without additional supervision.
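To make the object-level control above concrete, here is a minimal sketch of a per-object pose, assuming each object carries its own scale, rotation, and translation, and that a world-space point is mapped back into the object's own frame before its field is queried. The function names, the choice of the z-axis as rotation axis, and the parameter layout are illustrative assumptions, not taken from the paper's code.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def object_to_world(p: Vec3, scale: Vec3, angle: float, shift: Vec3) -> Vec3:
    """Scale, then rotate about the z-axis (assumed axis), then translate."""
    x, y, z = p[0] * scale[0], p[1] * scale[1], p[2] * scale[2]
    c, s = math.cos(angle), math.sin(angle)
    xr, yr = c * x - s * y, s * x + c * y
    return (xr + shift[0], yr + shift[1], z + shift[2])


def world_to_object(q: Vec3, scale: Vec3, angle: float, shift: Vec3) -> Vec3:
    """Inverse of object_to_world: undo translation, rotation, then scale."""
    x, y, z = q[0] - shift[0], q[1] - shift[1], q[2] - shift[2]
    c, s = math.cos(angle), math.sin(angle)
    # The inverse of a rotation matrix is its transpose.
    xr, yr = c * x + s * y, -s * x + c * y
    return (xr / scale[0], yr / scale[1], z / scale[2])
```

Under this sketch, moving or rotating an object only changes `shift` or `angle`; the object's field itself is untouched, which is what keeps pose separate from shape and appearance.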
Abstract
Deep generative models allow for photorealistic image synthesis at high resolutions. But for many applications, this is not enough: content creation also needs to be controllable. While several recent works investigate how to disentangle underlying factors of variation in the data, most of them operate in 2D and hence ignore that our world is three-dimensional. Further, only few works consider the compositional nature of scenes. Our key hypothesis is that incorporating a compositional 3D scene representation into the generative model leads to more controllable image synthesis. Representing scenes as compositional generative neural feature fields allows us to disentangle one or multiple objects from the background as well as individual objects' shapes and appearances while learning from unstructured and unposed image collections without any additional supervision. Combining this scene…
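The truncated abstract does not say how the per-entity feature fields are combined into one scene. One plausible rule, assumed here for illustration, is to sum the densities predicted by all entities at a 3D point and take the density-weighted mean of their feature vectors. The function name `compose` and the `eps` guard are hypothetical.

```python
from typing import List, Sequence, Tuple


def compose(
    densities: Sequence[float],
    features: Sequence[Sequence[float]],
    eps: float = 1e-10,
) -> Tuple[float, List[float]]:
    """Combine per-entity (density, feature) predictions at one 3D point.

    Returns the summed density and the density-weighted mean feature.
    `eps` (an assumption) avoids division by zero in empty space.
    """
    total = sum(densities)
    dim = len(features[0])
    combined = [
        sum(d * f[k] for d, f in zip(densities, features)) / max(total, eps)
        for k in range(dim)
    ]
    return total, combined


# Two entities (e.g. one object and the background) at the same point.
sigma, feat = compose([1.0, 3.0], [[1.0, 0.0], [0.0, 1.0]])
```

With this rule, an entity that predicts zero density at a point contributes nothing to the combined feature there, so adding or removing an object leaves the rest of the scene unchanged.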
Taxonomy
Methods
