Object-Centric Temporal Consistency via Conditional Autoregressive Inductive Biases
Cristian Meo, Akihiro Nakano, Mircea Lică, Aniket Didolkar, Masahiro Suzuki, Anirudh Goyal, Mengmi Zhang, Justin Dauwels, Yutaka Matsuo, Yoshua Bengio

TL;DR
This paper introduces CA-SA, a novel framework that improves temporal consistency in object-centric video representations by using autoregressive priors and a new consistency loss, enhancing performance in tasks like prediction and reasoning.
Contribution
The paper proposes Conditional Autoregressive Slot Attention (CA-SA), a new method that maintains temporal consistency in object-centric video representations, outperforming existing baselines.
Findings
Outperforms baselines in video prediction tasks
Enhances temporal consistency in object representations
Improves accuracy in visual question-answering
Abstract
Unsupervised object-centric learning from videos is a promising approach towards learning compositional representations that can be applied to various downstream tasks, such as prediction and reasoning. Recently, it was shown that pretrained Vision Transformers (ViTs) can be useful to learn object-centric representations on real-world video datasets. However, while these approaches succeed at extracting objects from the scenes, the slot-based representations fail to maintain temporal consistency across consecutive frames in a video, i.e. the mapping of objects to slots changes across the video. To address this, we introduce Conditional Autoregressive Slot Attention (CA-SA), a framework that enhances the temporal consistency of extracted object-centric representations in video-centric vision tasks. Leveraging an autoregressive prior network to condition representations on previous…
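The failure mode described in the abstract, where the mapping of objects to slots changes across frames, can be made concrete with a small diagnostic. The sketch below is not the CA-SA method or its consistency loss; it is an illustrative check under simple assumptions: each frame is a list of slot vectors, slots are compared by cosine similarity, and a slot counts as "consistent" if slot k at frame t is most similar to slot k at frame t+1. The function names `best_match` and `consistency_rate` are hypothetical.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors (0.0 if either is all zeros).
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_match(prev_slots, curr_slots):
    # For each slot in the previous frame, index of the most similar slot in the current frame.
    return [
        max(range(len(curr_slots)), key=lambda j: cosine(p, curr_slots[j]))
        for p in prev_slots
    ]

def consistency_rate(frames):
    # Fraction of (frame pair, slot) cases where slot k at t best matches slot k at t+1.
    hits = total = 0
    for prev, curr in zip(frames, frames[1:]):
        for k, j in enumerate(best_match(prev, curr)):
            hits += (j == k)
            total += 1
    return hits / total if total else 1.0

# Toy example (made-up vectors): two slots, three frames.
# The slots swap roles between the second and third frame.
frames = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.1, 0.9], [0.9, 0.1]],  # slot order flipped
]
print(best_match(frames[1], frames[2]))  # [1, 0]
print(consistency_rate(frames))          # 0.5
```

A rate of 1.0 means every slot keeps tracking the same content across consecutive frames under this similarity measure; the swap in the last frame is what lowers it to 0.5 here.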
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Constraint Satisfaction and Optimization · Neural Networks and Applications
Methods: Softmax · Attention Is All You Need
