Unsupervised Object Learning via Common Fate
Matthias Tangemann, Steffen Schneider, Julius von Kügelgen, Francesco Locatello, Peter Gehler, Thomas Brox, Matthias Kümmerer, Matthias Bethge, Bernhard Schölkopf

TL;DR
This paper presents an unsupervised approach for learning generative object models from videos. It decomposes the problem into motion segmentation, individual object modeling, and scene composition, which enables scene generation with occlusions and varying object counts.
Contribution
It introduces a modular framework, inspired by Gestalt principles, for unsupervised object learning from videos, along with the new Fishbowl dataset and a scene sampling procedure.
Findings
The learned object models generalize beyond the occlusions present in the input videos.
Scene sampling can produce novel configurations with unseen object counts.
The approach yields a modular scene representation that supports plausible scene generation.
Abstract
Learning generative object models from unlabelled videos is a long-standing problem and is required for causal scene modeling. We decompose this problem into three easier subtasks and provide candidate solutions for each of them. Inspired by the Common Fate Principle of Gestalt Psychology, we first extract (noisy) masks of moving objects via unsupervised motion segmentation. Second, generative models are trained on the masks of the background and the moving objects, respectively. Third, background and foreground models are combined in a conditional "dead leaves" scene model to sample novel scene configurations where occlusions and depth layering arise naturally. To evaluate the individual stages, we introduce the Fishbowl dataset, positioned between complex real-world scenes and common object-centric benchmarks of simplistic objects. We show that our approach allows learning generative…
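The third stage of the abstract, composing a scene in "dead leaves" fashion, can be illustrated with a small sketch. This is not the authors' implementation. It only shows the general idea of pasting sampled objects onto a background one after another, so that later objects occlude earlier ones and depth layering follows from the drawing order. The function names (`sample_object_mask`, `compose_scene`, `visible_area`) are hypothetical, and random rectangles stand in for the learned background and object models.

```python
import random


def sample_object_mask(rng, max_size=4):
    """Hypothetical stand-in for a learned object model.

    Returns a binary mask as a list of rows; here simply a filled
    rectangle with side lengths between 2 and max_size.
    """
    h = rng.randint(2, max_size)
    w = rng.randint(2, max_size)
    return [[1] * w for _ in range(h)]


def compose_scene(height, width, num_objects, seed=0):
    """Paste objects back to front onto a background canvas.

    Assumes the canvas is at least max_size pixels in each dimension.
    Each cell stores the id of the object visible there (0 = background).
    """
    rng = random.Random(seed)
    # A learned background model would be sampled here; 0 is a placeholder.
    canvas = [[0] * width for _ in range(height)]
    for obj_id in range(1, num_objects + 1):
        mask = sample_object_mask(rng)
        h, w = len(mask), len(mask[0])
        top = rng.randint(0, height - h)
        left = rng.randint(0, width - w)
        for dy in range(h):
            for dx in range(w):
                if mask[dy][dx]:
                    # Overwriting earlier ids is what creates occlusion.
                    canvas[top + dy][left + dx] = obj_id
    return canvas


def visible_area(canvas, obj_id):
    """Count the cells in which the given object is visible."""
    return sum(row.count(obj_id) for row in canvas)


if __name__ == "__main__":
    scene = compose_scene(height=8, width=8, num_objects=3, seed=0)
    for row in scene:
        print("".join(str(v) for v in row))
```

Because objects are drawn in order, the number of objects is a free parameter of the sampler, which is consistent with the claim above that scenes with varying object counts can be generated. The object drawn last is always fully visible, while earlier ones may be partly or entirely hidden.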
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques
