Grounded Object Centric Learning

Avinash Kori; Francesco Locatello; Fabio De Sousa Ribeiro; Francesca Toni; Ben Glocker

arXiv:2307.09437 · cs.LG · January 26, 2024

TL;DR

This paper introduces CoSA, a novel method for learning stable, object-specific representations using a grounded slot dictionary, improving robustness and specialization in object-centric learning tasks.

Contribution

The paper proposes CoSA with a Grounded Slot Dictionary, enabling specialized, invariant object representations and addressing limitations of Slot Attention.

Findings

01. Improved scene generation and composition performance.
02. Enhanced task adaptation capabilities.
03. Competitive results on object discovery benchmarks.

Abstract

The extraction of modular object-centric representations for downstream tasks is an emerging area of research. Learning grounded representations of objects that are guaranteed to be stable and invariant promises robust performance across different tasks and environments. Slot Attention (SA) learns object-centric representations by assigning objects to slots, but presupposes a single distribution from which all slots are randomly initialised. This results in an inability to learn specialized slots which bind to specific object types and remain invariant to identity-preserving changes in object appearance. To address this, we present Conditional Slot Attention (CoSA) using a novel concept of Grounded Slot Dictionary (GSD) inspired by vector quantization. Our proposed GSD comprises (i) canonical object-level…
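To make the contrast in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. The first function shows the shared-Gaussian slot initialisation that the abstract attributes to Slot Attention. The second shows one way a vector-quantization-style dictionary lookup could select a separate initialisation distribution per slot. Because the abstract is truncated, the actual contents of the GSD are not reproduced here: the function names, the `keys`, `mus`, and `log_sigmas` arrays, and the per-slot query `features` are all hypothetical and introduced only for illustration.

```python
import numpy as np


def init_slots_shared(num_slots, dim, mu, log_sigma, rng):
    """Shared initialisation: every slot is drawn from one Gaussian."""
    noise = rng.standard_normal((num_slots, dim))
    return mu + np.exp(log_sigma) * noise


def init_slots_from_dictionary(features, keys, mus, log_sigmas, rng):
    """Hypothetical dictionary-conditioned initialisation (illustrative only).

    features:   (num_slots, dim) one query vector per slot (assumed given)
    keys:       (codebook_size, dim) dictionary keys
    mus:        (codebook_size, dim) per-entry Gaussian means
    log_sigmas: (codebook_size, dim) per-entry log standard deviations
    """
    # Squared Euclidean distance from each query to each key.
    d2 = ((features[:, None, :] - keys[None, :, :]) ** 2).sum(axis=-1)
    # Vector-quantization step: pick the nearest dictionary entry per query.
    idx = d2.argmin(axis=1)
    # Sample each slot from the Gaussian attached to its selected entry.
    noise = rng.standard_normal(features.shape)
    slots = mus[idx] + np.exp(log_sigmas[idx]) * noise
    return slots, idx


rng = np.random.default_rng(0)
keys = np.array([[0.0, 0.0], [10.0, 10.0]])
mus = np.array([[-1.0, -1.0], [5.0, 5.0]])
log_sigmas = np.full((2, 2), -20.0)  # near-zero noise for a readable demo
features = np.array([[9.0, 9.0], [1.0, 1.0], [10.0, 11.0]])
slots, idx = init_slots_from_dictionary(features, keys, mus, log_sigmas, rng)
print(idx)  # index of the dictionary entry chosen for each slot
```

In the shared variant, nothing distinguishes one slot from another at initialisation, which is the limitation the abstract describes. In the dictionary variant, slots whose queries fall near the same key start from the same entry-specific distribution, which is one plausible route to specialization under the assumptions stated above.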

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
