Unsupervised Object-Based Transition Models for 3D Partially Observable Environments
Antonia Creswell, Rishabh Kabra, Chris Burgess, Murray Shanahan

TL;DR
This paper introduces an unsupervised, object-based transition model for 3D environments that maintains object identity over time, effectively handling occlusion and partial observability without supervision.
Contribution
The proposed model combines slot-wise object decomposition, an alignment module that keeps slots in a consistent order over time, and end-to-end training to improve object persistence and identity tracking in partially observable scenes.
Findings
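Outperforms a state-of-the-art baseline, which the abstract attributes to the object-level loss combined with correct object alignment over time.
Handles object occlusion and re-appearance in partially observable environments.
Trains end-to-end without supervision, using losses on the object-structured representation rather than on pixels.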
Abstract
We present a slot-wise, object-based transition model that decomposes a scene into objects, aligns them (with respect to a slot-wise object memory) to maintain a consistent order across time, and predicts how those objects evolve over successive frames. The model is trained end-to-end without supervision using losses at the level of the object-structured representation rather than pixels. Thanks to its alignment module, the model deals properly with two issues that are not handled satisfactorily by other transition models, namely object persistence and object identity. We show that the combination of an object-level loss and correct object alignment over time enables the model to outperform a state-of-the-art baseline, and allows it to deal well with object occlusion and re-appearance in partially observable environments.
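The alignment and object-level loss described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the paper's alignment module is learned, whereas the sketch below substitutes a brute-force search over slot permutations, and the function names (`align_slots`, `object_level_loss`), the squared-distance cost, and the toy 2-D slot vectors are all assumptions made for illustration.

```python
from itertools import permutations


def squared_distance(a, b):
    """Squared Euclidean distance between two slot vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def align_slots(memory, slots):
    """Reorder `slots` so that slot i best matches memory[i].

    Hypothetical stand-in for a learned alignment module: tries every
    permutation, so it is only practical for a handful of slots.
    """
    k = len(memory)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(k)):
        cost = sum(squared_distance(memory[i], slots[perm[i]]) for i in range(k))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return [slots[j] for j in best_perm], best_perm


def object_level_loss(predicted, target):
    """Mean per-slot squared error, computed on slot vectors rather than pixels."""
    return sum(squared_distance(p, t) for p, t in zip(predicted, target)) / len(predicted)


# Toy example: the encoder returns the same three objects in a shuffled order.
memory = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]   # slot-wise memory (assumed values)
slots = [[5.1, 4.9], [0.1, 0.0], [1.0, 0.9]]    # current frame's slots, unordered

aligned, perm = align_slots(memory, slots)
loss = object_level_loss(memory, aligned)
print(perm)     # (1, 2, 0)
print(aligned)  # [[0.1, 0.0], [1.0, 0.9], [5.1, 4.9]]
```

Without the alignment step, the same loss would compare unrelated objects slot by slot and penalise a prediction that is correct up to ordering, which is one way to see why a consistent slot order matters for an object-level loss.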
Taxonomy
Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis
