Neural Systematic Binder
Gautam Singh, Yeongbin Kim, Sungjin Ahn

TL;DR
The paper introduces SysBinder, a neural mechanism that constructs structured, object-centric scene representations from unstructured images, enabling systematic generalization and improved factor disentanglement.
Contribution
It proposes a novel Block-Slot Representation and a neural layer, SysBinder, for unsupervised, systematic, and general-purpose scene understanding across modalities.
Findings
SysBinder outperforms conventional methods in factor disentanglement.
It enables systematic generalization to unseen factor combinations.
It is effective on complex scene images such as those in CLEVR-Tex.
Abstract
The key to high-level cognition is believed to be the ability to systematically manipulate and compose knowledge pieces. While token-like structured knowledge representations are naturally provided in text, it is elusive how to obtain them for unstructured modalities such as scene images. In this paper, we propose a neural mechanism called Neural Systematic Binder or SysBinder for constructing a novel structured representation called Block-Slot Representation. In Block-Slot Representation, object-centric representations known as slots are constructed by composing a set of independent factor representations called blocks, to facilitate systematic generalization. SysBinder obtains this structure in an unsupervised way by alternatingly applying two different binding principles: spatial binding for spatial modularity across the full scene and factor binding for factor modularity within an…
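The structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a toy NumPy version under stated assumptions, written only to show what "slots composed of blocks" and "alternating spatial and factor binding" could look like. The function names (`spatial_binding`, `factor_binding`, `sys_binder_sketch`), the per-factor prototype memory, the random initialization, and all sizes are hypothetical choices for illustration. A trained model would use learned projections and recurrent updates instead.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatial_binding(slots, inputs):
    # Assumed form: slots compete for input features (softmax over slots),
    # then each slot takes a weighted mean of the inputs it won.
    d = slots.shape[1]
    logits = inputs @ slots.T / np.sqrt(d)                      # (N, K)
    attn = softmax(logits, axis=1)                              # competition across slots
    weights = attn / (attn.sum(axis=0, keepdims=True) + 1e-8)   # normalize per slot
    return weights.T @ inputs                                   # (K, D)

def factor_binding(slots, prototypes):
    # Assumed form: split each slot into M blocks; block m attends only to
    # the prototype memory of factor m, so factors stay modular.
    K, D = slots.shape
    M, P, d_block = prototypes.shape
    blocks = slots.reshape(K, M, d_block)
    out = np.empty_like(blocks)
    for m in range(M):
        logits = blocks[:, m] @ prototypes[m].T / np.sqrt(d_block)  # (K, P)
        out[:, m] = softmax(logits, axis=1) @ prototypes[m]         # (K, d_block)
    return out.reshape(K, D)

def sys_binder_sketch(inputs, num_slots=3, num_blocks=4,
                      num_prototypes=5, iters=3, seed=0):
    # Alternate the two binding steps; returns a (slots, blocks, d_block) array.
    rng = np.random.default_rng(seed)
    N, D = inputs.shape
    assert D % num_blocks == 0, "feature size must split evenly into blocks"
    d_block = D // num_blocks
    slots = rng.normal(size=(num_slots, D))                     # random init (assumption)
    prototypes = rng.normal(size=(num_blocks, num_prototypes, d_block))
    for _ in range(iters):
        slots = spatial_binding(slots, inputs)
        slots = factor_binding(slots, prototypes)
    return slots.reshape(num_slots, num_blocks, d_block)

if __name__ == "__main__":
    features = np.random.default_rng(1).normal(size=(16, 8))   # 16 toy input features
    block_slots = sys_binder_sketch(features)
    print(block_slots.shape)
```

In this layout, one slot is a row of blocks, so swapping a single block between two slots changes one factor of an object while leaving the others untouched. That is the kind of composition the Block-Slot Representation is meant to support.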
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Advanced Image and Video Retrieval Techniques
