Object-centric Learning with Cyclic Walks between Parts and Whole
Ziyu Wang, Mike Zheng Shou, Mengmi Zhang

TL;DR
This paper introduces a novel object-centric learning method that uses cyclic walks between perceptual features and object representations to improve scene understanding and object segmentation in complex images.
Contribution
The paper proposes cyclic walks between perceptual features and object slots as a new supervisory signal for unsupervised object-centric learning, removing the need for a decoder.
Findings
Effective disentanglement of foregrounds and backgrounds.
Successful discovery and segmentation of objects in complex scenes.
Improved memory efficiency over decoder-based models.
Abstract
Learning object-centric representations from complex natural environments equips both humans and machines with the ability to reason from low-level perceptual features. To capture compositional entities of the scene, we propose cyclic walks between perceptual features extracted from vision transformers and object entities. First, a slot-attention module interfaces with these perceptual features and produces a finite set of slot representations. These slots can bind to any object entities in the scene via inter-slot competition for attention. Next, we establish entity-feature correspondence with cyclic walks along paths of high transition probability, based on the pairwise similarity between perceptual features (aka "parts") and slot-bound object representations (aka "whole"). The whole is greater than its parts and the parts constitute the whole. The part-whole interactions form cycle…
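The cyclic-walk idea described in the abstract can be illustrated with a minimal sketch. It assumes that transition probabilities are a temperature-scaled softmax over dot-product similarities, and that a round trip (whole to parts to whole) is scored by how much probability mass returns to its starting slot. The function names, the temperature value, the toy vectors, and the exact form of the loss are all assumptions made for illustration; the truncated abstract does not specify them, and only one walk direction is shown.

```python
import math

def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transition(src, dst, temperature=0.1):
    """Row-stochastic matrix: P[i][j] is the probability of walking src[i] -> dst[j]."""
    return [softmax([dot(s, d) for d in dst], temperature) for s in src]

def matmul(a, b):
    cols = list(zip(*b))
    return [[dot(row, col) for col in cols] for row in a]

def cyclic_walk(slots, features, temperature=0.1):
    """Round trip whole -> parts -> whole, returned as a K x K matrix."""
    whole_to_parts = transition(slots, features, temperature)
    parts_to_whole = transition(features, slots, temperature)
    return matmul(whole_to_parts, parts_to_whole)

def cycle_loss(round_trip):
    """Assumed loss: mean negative log-probability of returning to the start slot."""
    eps = 1e-12
    k = len(round_trip)
    return -sum(math.log(round_trip[i][i] + eps) for i in range(k)) / k

# Toy data (hypothetical): four "part" features forming two clusters, two "whole" slots.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
slots = [[1.0, 0.0], [0.0, 1.0]]

round_trip = cyclic_walk(slots, features)
loss = cycle_loss(round_trip)
```

When each slot aligns with a distinct cluster of features, the round-trip matrix is close to the identity and the loss is near zero. In this sketch the signal comes only from the slot and feature vectors themselves, with no reconstruction step.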
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
