Towards causal generative scene models via competition of experts
Julius von Kügelgen, Ivan Ustyuzhaninov, Peter Gehler, Matthias Bethge, Bernhard Schölkopf

TL;DR
This paper introduces a modular, ensemble-based generative scene model that learns to represent and recombine objects with correct depth ordering and occlusion, advancing towards causal scene understanding.
Contribution
It proposes a novel ensemble of experts trained with an inductive bias for modularity, enabling explicit object separation and physically plausible scene recombination.
Findings
Experts specialize on different object classes
Model handles depth layering and occlusion correctly
Qualitative results demonstrate conceptual advantages
Abstract
Learning how to model complex scenes in a modular way with recombinable components is a pre-requisite for higher-order reasoning and acting in the physical world. However, current generative models lack the ability to capture the inherently compositional and layered nature of visual scenes. While recent work has made progress towards unsupervised learning of object-based scene representations, most models still maintain a global representation space (i.e., objects are not explicitly separated), and cannot generate scenes with novel object arrangement and depth ordering. Here, we present an alternative approach which uses an inductive bias encouraging modularity by training an ensemble of generative models (experts). During training, experts compete for explaining parts of a scene, and thus specialise on different object classes, with objects being identified as parts that re-occur…
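To make the idea of recombinable objects with an explicit depth ordering concrete, here is a minimal sketch of back-to-front layer compositing. It is not the authors' implementation: the `composite` function, the `(depth, mask, appearance)` layer format, and the toy four-pixel "images" are assumptions chosen purely for illustration. In the paper's setting, each object layer would come from a learned expert rather than being written by hand.

```python
def composite(layers, background):
    """Paint layers back-to-front so nearer objects occlude farther ones.

    layers: list of (depth, mask, appearance) tuples (hypothetical format);
            a larger depth means the object is farther from the camera.
    mask, appearance, background: flat lists with one value per pixel;
            mask values lie in [0, 1].
    """
    canvas = list(background)
    # Sort farthest first, so the nearest layer is painted last and wins.
    ordered = sorted(layers, key=lambda layer: layer[0], reverse=True)
    for _depth, mask, appearance in ordered:
        canvas = [m * a + (1.0 - m) * c
                  for m, a, c in zip(mask, appearance, canvas)]
    return canvas


background = [0.0, 0.0, 0.0, 0.0]
square_mask, square_pixels = [1.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]
circle_mask, circle_pixels = [0.0, 1.0, 1.0, 0.0], [0.9, 0.9, 0.9, 0.9]

# Circle in front of square: the overlapping pixel (index 1) shows the circle.
scene = composite([(2.0, square_mask, square_pixels),
                   (1.0, circle_mask, circle_pixels)], background)

# Same objects with the depth order swapped: index 1 now shows the square.
swapped = composite([(1.0, square_mask, square_pixels),
                     (2.0, circle_mask, circle_pixels)], background)
```

Because each object is kept as a separate layer, changing the depth values alone produces a new, consistent occlusion pattern without touching the objects themselves. A single global scene representation does not offer this property directly.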
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Image Processing and 3D Reconstruction
