Enhancing Interpretable Object Abstraction via Clustering-based Slot Initialization

Ning Gao; Bernard Hohmann; Gerhard Neumann

arXiv:2308.11369 · cs.CV · August 23, 2023

Open Access

TL;DR

This paper introduces a clustering-based initialization method for object-centric slots that improves accuracy and automatically determines the number of slots, enhancing scene understanding in complex environments.

Contribution

The paper proposes a clustering-based slot initialization approach, with permutation-invariant and permutation-equivariant versions of the initialization layer and automatic detection of the number of slots, advancing object-centric representation methods.

Findings

01. Outperforms prior methods on object discovery tasks

02. Improves accuracy in complex scene representations

03. Automatically identifies the optimal number of slots

Abstract

Object-centric representations using slots have shown advances towards efficient, flexible, and interpretable abstraction from low-level perceptual features in a compositional scene. Current approaches randomize the initial state of the slots and then refine them iteratively. As we show in this paper, random slot initialization significantly affects the accuracy of the final slot prediction. Moreover, current approaches require a predetermined number of slots, taken from prior knowledge of the data, which limits their applicability in the real world. In our work, we initialize the slot representations with clustering algorithms conditioned on the perceptual input features. This requires an additional layer in the architecture to initialize the slots given the identified clusters. We design permutation invariant and permutation equivariant versions of this layer to enable the exchangeable…
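
To make the idea of clustering-conditioned slot initialization concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes k-means as a stand-in for the unspecified "clustering algorithms", assumes a fixed number of clusters `k` (the automatic slot-number detection is not modeled), and replaces the learned initialization layer with simple mean pooling, which is permutation invariant. The names `kmeans`, `mean_pool`, and `init_slots` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_pool(vectors):
    """Permutation-invariant pooling: the mean ignores the order of its inputs."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed stand-in for the paper's clustering step)."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(features, k)]
    labels = [0] * len(features)
    for _ in range(iters):
        # Assign every feature vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(f, centroids[j]))
            for f in features
        ]
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = [f for f, lab in zip(features, labels) if lab == j]
            if members:
                centroids[j] = mean_pool(members)
    return centroids, labels


def init_slots(features, k, seed=0):
    """Return one initial slot vector per cluster.

    Assumption: each slot is the pooled mean of its cluster's features.
    A learned layer would replace mean_pool in a real model.
    """
    centroids, labels = kmeans(features, k, seed=seed)
    slots = []
    for j in range(k):
        members = [f for f, lab in zip(features, labels) if lab == j]
        slots.append(mean_pool(members) if members else centroids[j])
    return slots


# Toy "perceptual features": six 2-D vectors forming two loose groups.
features = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
]
slots = init_slots(features, k=2)
```

The resulting `slots` would then seed the iterative refinement in place of randomly sampled initial states. Mean pooling is used here only because it is the simplest order-independent reduction over a cluster's members.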

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Advanced Vision and Imaging