TL;DR
This paper introduces an unsupervised layered image decomposition method that models objects as explicit prototypes and predicts their transformations and occlusions, enabling accurate object separation and discovery in both synthetic and real images.
Contribution
The authors propose an unsupervised framework that explicitly models object prototypes and their transformations. It achieves state-of-the-art results on synthetic benchmarks and also applies to real-world images.
Findings
Achieves state-of-the-art results on synthetic benchmarks.
Successfully applies to real images for clustering and object discovery.
Presented as the first layered decomposition method to learn explicit object concepts.
Abstract
We present an unsupervised learning framework for decomposing images into layers of automatically discovered object models. Contrary to recent approaches that model image layers with autoencoder networks, we represent them as explicit transformations of a small set of prototypical images. Our model has three main components: (i) a set of object prototypes in the form of learnable images with a transparency channel, which we refer to as sprites; (ii) differentiable parametric functions predicting occlusions and transformation parameters necessary to instantiate the sprites in a given image; (iii) a layered image formation model with occlusion for compositing these instances into complete images including background. By jointly learning the sprites and occlusion/transformation predictors to reconstruct images, our approach not only yields accurate layered image decompositions, but also…
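To make component (iii) concrete, here is a minimal sketch of layered compositing with occlusion. It assumes the sprites have already been transformed into full-size RGBA layers and uses the standard back-to-front alpha "over" rule. This is an illustrative assumption, not necessarily the exact image formation model of the paper, and the function name `composite_layers` is hypothetical.

```python
import numpy as np


def composite_layers(background, layers):
    """Composite RGBA layers over an RGB background, back to front.

    background: float array of shape (H, W, 3), values in [0, 1].
    layers: list of float arrays of shape (H, W, 4), values in [0, 1],
            ordered from farthest to nearest; channel 3 is the
            transparency (alpha) channel.
    Assumption: each layer is a sprite that was already transformed
    (positioned, scaled, recolored) into the image frame.
    """
    image = background.astype(np.float64).copy()
    for layer in layers:
        color = layer[..., :3]
        alpha = layer[..., 3:4]  # keep the last axis so it broadcasts over RGB
        # Standard "over" operator: nearer layers occlude what is behind them.
        image = alpha * color + (1.0 - alpha) * image
    return image


# Toy example: an opaque red band behind an opaque blue band.
background = np.zeros((4, 4, 3))

red = np.zeros((4, 4, 4))
red[:2, :, 0] = 1.0  # red color in the top two rows
red[:2, :, 3] = 1.0  # opaque in the top two rows

blue = np.zeros((4, 4, 4))
blue[:, :2, 2] = 1.0  # blue color in the left two columns
blue[:, :2, 3] = 1.0  # opaque in the left two columns

image = composite_layers(background, [red, blue])
```

Because the blue layer comes last in the list, it occludes the red layer wherever the two overlap. In a learned model the layer ordering would be predicted rather than fixed by hand.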
