Grounded Object Centric Learning
Avinash Kori, Francesco Locatello, Fabio De Sousa Ribeiro, Francesca Toni, Ben Glocker

TL;DR
This paper introduces CoSA, a novel method for learning stable, object-specific representations using a grounded slot dictionary, improving robustness and specialization in object-centric learning tasks.
Contribution
The paper proposes CoSA with a Grounded Slot Dictionary, enabling specialized, invariant object representations and addressing limitations of Slot Attention.
Findings
Improved scene generation and composition performance.
Enhanced task adaptation capabilities.
Competitive results on object discovery benchmarks.
Abstract
The extraction of modular object-centric representations for downstream tasks is an emerging area of research. Learning grounded representations of objects that are guaranteed to be stable and invariant promises robust performance across different tasks and environments. Slot Attention (SA) learns object-centric representations by assigning objects to slots, but presupposes a single distribution from which all slots are randomly initialised. This results in an inability to learn specialized slots which bind to specific object types and remain invariant to identity-preserving changes in object appearance. To address this, we present Conditional Slot Attention (CoSA) using a novel concept of Grounded Slot Dictionary (GSD) inspired by vector quantization. Our proposed GSD comprises (i) canonical object-level…
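The contrast the abstract draws, between initialising every slot from one shared distribution and initialising slots from a dictionary lookup, can be illustrated with a small sketch. This is not the paper's implementation: the function names, the nearest-neighbour lookup, and the per-entry Gaussian parameters (`slot_means`, `slot_stds`) are all assumptions made for illustration, and the full definition of the GSD is truncated in the abstract above.

```python
import random


def init_slots_shared(num_slots, dim, rng):
    """Baseline: every slot is sampled from one shared Gaussian N(0, I)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_slots)]


def nearest_code(feature, codebook):
    """Vector-quantization lookup: index of the codebook entry closest to `feature`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(feature, codebook[i]))


def init_slots_from_dictionary(features, codebook, slot_means, slot_stds, rng):
    """Hypothetical dictionary-conditioned init (an assumption, not the paper's code).

    Each feature selects its nearest codebook entry, and one slot is sampled
    from the Gaussian attached to that entry.
    """
    indices = [nearest_code(f, codebook) for f in features]
    slots = [
        [rng.gauss(m, s) for m, s in zip(slot_means[i], slot_stds[i])]
        for i in indices
    ]
    return slots, indices


if __name__ == "__main__":
    rng = random.Random(0)
    codebook = [[0.0, 0.0], [1.0, 1.0]]       # two illustrative dictionary keys
    slot_means = [[-5.0, -5.0], [5.0, 5.0]]   # one slot prior per key
    slot_stds = [[0.1, 0.1], [0.1, 0.1]]
    features = [[0.9, 1.2], [0.1, -0.2]]
    slots, indices = init_slots_from_dictionary(
        features, codebook, slot_means, slot_stds, rng
    )
    print(indices)  # [1, 0]: each feature maps to its nearest key
```

In the shared case, nothing ties a given slot to a given object type, since all slots start from the same prior. In the dictionary-conditioned case, features that map to the same entry always start from the same entry-specific prior, which is the intuition behind specialised slots.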
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
