MONet: Unsupervised Scene Decomposition and Representation
Christopher P. Burgess, Loic Matthey, Nicholas Watters, Rishabh Kabra, Irina Higgins, Matt Botvinick, Alexander Lerchner

TL;DR
MONet is an unsupervised model that decomposes complex scenes into meaningful components like objects and background, aiding reasoning and transfer learning.
Contribution
Introduces MONet, an unsupervised approach that trains a VAE end-to-end with a recurrent attention network to decompose scenes into semantic components.
Findings
Successfully decomposes 3D scenes into meaningful objects.
Learns representations that facilitate reasoning and transfer.
Operates without supervision, using end-to-end training.
Abstract
The ability to decompose scenes in terms of abstract building blocks is crucial for general intelligence. Where those basic building blocks share meaningful properties, interactions and other regularities across scenes, such decompositions can simplify reasoning and facilitate imagination of novel scenarios. In particular, representing perceptual observations in terms of entities should improve data efficiency and transfer performance on a wide range of tasks. Thus we need models capable of discovering useful decompositions of scenes by identifying units with such regularities and representing them in a common format. To address this problem, we have developed the Multi-Object Network (MONet). In this model, a VAE is trained end-to-end together with a recurrent attention network -- in a purely unsupervised manner -- to provide attention masks around, and reconstructions of, regions of…
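The recurrent attention step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the "scope" formulation described in the full paper, where each step claims a fraction of the pixels not yet explained and the final slot takes the remainder. The function name `decompose_masks` and the hard-coded `alphas` values are hypothetical stand-ins for the output of a learned attention network; this is not the authors' implementation.

```python
def decompose_masks(alphas):
    """Turn K-1 per-pixel attention maps into K masks that sum to 1.

    alphas: list of K-1 lists, each holding one value in [0, 1] per pixel
    (hypothetical attention-network outputs).
    """
    num_pixels = len(alphas[0])
    scope = [1.0] * num_pixels  # initially every pixel is unexplained
    masks = []
    for alpha in alphas:
        # This step's mask is the attended share of the remaining scope.
        masks.append([s * a for s, a in zip(scope, alpha)])
        # Whatever was not attended stays in scope for later steps.
        scope = [s * (1.0 - a) for s, a in zip(scope, alpha)]
    masks.append(scope)  # the last slot explains everything left over
    return masks


# Three pixels, two attention steps, hence three masks.
alphas = [[0.9, 0.1, 0.0], [0.5, 0.8, 0.2]]
masks = decompose_masks(alphas)
```

Because each mask is carved out of the remaining scope, the masks form a per-pixel partition of the image, which is what lets each slot's VAE reconstruction be held responsible for its own region.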
Videos
DeepMind's AI Learned a Better Understanding of 3D Scenes · YouTube
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Spatial Broadcast Decoder
