GENESIS: Generative Scene Inference and Sampling with Object-Centric Latent Representations

Martin Engelcke; Adam R. Kosiorek; Oiwi Parker Jones; Ingmar Posner

arXiv:1907.13052 · cs.LG · November 24, 2020 · 73 cites

TL;DR

GENESIS is an object-centric generative model of 3D visual scenes that captures relationships between scene components. This lets it both decompose existing scenes into objects and sample new scenes in a principled way, with robotics and reinforcement learning as the motivating applications.

Contribution

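The paper introduces GENESIS, which it presents as the first object-centric generative model of 3D visual scenes that can both decompose and generate scenes by capturing relationships between scene components. Earlier models such as MONet and IODINE decompose scenes without supervision, but their generative processes do not account for component interactions.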

Findings

01 Effective scene decomposition demonstrated on multiple datasets.

02 Capable of generating realistic novel scenes with object interactions.

03 Improved semi-supervised learning performance.

Abstract

Generative latent-variable models are emerging as promising tools in robotics and reinforcement learning. Yet, even though tasks in these domains typically involve distinct objects, most state-of-the-art generative models do not explicitly capture the compositional nature of visual scenes. Two recent exceptions, MONet and IODINE, decompose scenes into objects in an unsupervised fashion. Their underlying generative processes, however, do not account for component interactions. Hence, neither of them allows for principled sampling of novel scenes. Here we present GENESIS, the first object-centric generative model of 3D visual scenes capable of both decomposing and generating scenes by capturing relationships between scene components. GENESIS parameterises a spatial GMM over images which is decoded from a set of object-centric latent variables that are either inferred sequentially in an…
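The spatial GMM mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it only shows how an image log-likelihood is computed once per-component appearance means and per-pixel mixing weights are available. The function name `spatial_gmm_log_likelihood`, the array shapes, the fixed `sigma`, and the assumption that colour channels are independent given the component are all choices made here for illustration.

```python
import numpy as np

def spatial_gmm_log_likelihood(image, means, mask_logits, sigma=0.5):
    """Log-likelihood of an image under a per-pixel Gaussian mixture.

    image:       (H, W, C) observed pixel values
    means:       (K, H, W, C) appearance of each of K components (hypothetical decoder output)
    mask_logits: (K, H, W) unnormalised mixing weights (hypothetical decoder output)
    sigma:       fixed component standard deviation (illustrative value, an assumption)
    """
    # Softmax over the K components so the weights sum to 1 at every pixel.
    shifted = mask_logits - mask_logits.max(axis=0, keepdims=True)
    log_pi = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))

    # Gaussian log-density of each pixel under each component, summed over channels.
    sq_err = (image[None] - means) ** 2
    log_norm = -0.5 * sq_err / sigma**2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)
    log_comp = log_norm.sum(axis=-1)  # shape (K, H, W)

    # Numerically stable log-sum-exp over components, then sum over pixels.
    joint = log_pi + log_comp
    joint_max = joint.max(axis=0)
    per_pixel = joint_max + np.log(np.exp(joint - joint_max[None]).sum(axis=0))
    return per_pixel.sum(), np.exp(log_pi)

# Usage with random stand-ins for the decoder outputs.
rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
means = rng.random((4, 8, 8, 3))
mask_logits = rng.normal(size=(4, 8, 8))
log_lik, weights = spatial_gmm_log_likelihood(image, means, mask_logits)
```

The returned `weights` array plays the role of a soft segmentation: at each pixel it says how much each component is responsible for that location, which is what makes the mixture "spatial".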

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · 3D Shape Modeling and Analysis

Methods: Mixture model network · Generalized ELBO with Constrained Optimization · Spatial Broadcast Decoder · Gated Linear Unit · Exponential Linear Unit · Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Batch Normalization · Adam