Object-centric architectures enable efficient causal representation learning
Amin Mansouri, Jason Hartford, Yan Zhang, Yoshua Bengio

TL;DR
This paper introduces an object-centric architecture that combines weak supervision and causal representation learning to improve disentanglement of object properties in images, addressing the failure of traditional methods when the generative function is non-injective.
Contribution
It modifies the Slot Attention architecture to incorporate weak supervision, enabling more data-efficient disentanglement of object properties in images.
Findings
Successfully disentangles object properties in image experiments
Requires fewer perturbations than Euclidean-based approaches
Addresses non-injectivity issues in object-based generative models
Abstract
Causal representation learning has shown a variety of settings in which we can disentangle latent variables with identifiability guarantees (up to some reasonable equivalence class). Common to all of these approaches is the assumption that (1) the latent variables are represented as d-dimensional vectors, and (2) that the observations are the output of some injective generative function of these latent variables. While these assumptions appear benign, we show that when the observations are of multiple objects, the generative function is no longer injective and disentanglement fails in practice. We can address this failure by combining recent developments in object-centric learning and causal representation learning. By modifying the Slot Attention architecture (arXiv:2006.15055), we develop an object-centric architecture that leverages weak supervision from sparse perturbations to…
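The non-injectivity claim in the abstract can be illustrated with a small sketch. It is not the paper's setup: the `render` function, the 8x8 canvas, and the example positions are hypothetical and chosen only for illustration. The sketch stores two objects' positions in one flat latent vector and shows that swapping the objects changes the vector but not the rendered image, so two distinct latents map to the same observation.

```python
def render(latents, size=8):
    """Toy renderer (illustrative assumption): latents is a flat list
    [x1, y1, x2, y2, ...] of integer pixel positions, one pair per object."""
    image = [[0] * size for _ in range(size)]
    for i in range(0, len(latents), 2):
        x, y = latents[i], latents[i + 1]
        image[y][x] = 1  # draw each object as a single lit pixel
    return image

# Two latent vectors describing the same scene with the objects in swapped order.
z_a = [1, 2, 5, 6]  # object 1 at (1, 2), object 2 at (5, 6)
z_b = [5, 6, 1, 2]  # object 1 at (5, 6), object 2 at (1, 2)

same_image = render(z_a) == render(z_b)  # the images are identical
different_latents = z_a != z_b           # the latent vectors are not

print(same_image, different_latents)
```

Because the image carries no information about which object came first in the vector, no encoder can recover the ordering from the image alone. Treating the scene as a set of per-object slots is one way to sidestep this.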
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Handwritten Text Recognition Techniques
Methods: Sparse Evolutionary Training
