Object Permanence Emerges in a Random Walk along Memory

Pavel Tokmakov; Allan Jabri; Jie Li; Adrien Gaidon

arXiv:2204.01784 · cs.CV · June 14, 2022 · 6 cites

TL;DR

This paper introduces a self-supervised method for learning object permanence by optimizing the temporal coherence of memory representations, enabling models to localize occluded objects and predict their motion without human annotations.

Contribution

It presents a novel self-supervised learning approach that leverages a Markov walk on memory features to capture object permanence without explicit supervision or assumptions about object dynamics.

Findings

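01 Outperforms existing approaches on several datasets of increasing complexity and realism

02 Requires neither human annotation nor assumptions about object dynamics

03 Learns a memory representation that stores occluded objects and predicts their motion to localize them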

Abstract

This paper proposes a self-supervised objective for learning representations that localize objects under occlusion, a property known as object permanence. A central question is the choice of learning signal in cases of total occlusion. Rather than directly supervising the locations of invisible objects, we propose a self-supervised objective that requires neither human annotation nor assumptions about object dynamics. We show that object permanence can emerge by optimizing for temporal coherence of memory: we fit a Markov walk along a space-time graph of memories, where the states in each time step are non-Markovian features from a sequence encoder. This leads to a memory representation that stores occluded objects and predicts their motion, to better localize them. The resulting model outperforms existing approaches on several datasets of increasing complexity and realism, despite…
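The idea of fitting a Markov walk along a space-time graph of memories can be illustrated with a small sketch. This is not the authors' implementation (the linked repository is the reference for that); it is a minimal pure-Python illustration under several assumptions: each time step holds a list of node feature vectors, transition probabilities are a softmax over dot-product affinities, and the loss is the negative log-probability that a walker started at a chosen node arrives at a chosen target node in the last step. All function names, the temperature value, and the toy features are hypothetical.

```python
import math

def softmax(scores, temperature):
    # Numerically stable softmax with a temperature (assumed hyperparameter).
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def transition_matrix(nodes_t, nodes_next, temperature=0.1):
    # Row i is the walker's distribution over next-step nodes,
    # derived from dot-product affinities (an assumption of this sketch).
    return [softmax([dot(u, v) for v in nodes_next], temperature)
            for u in nodes_t]

def walk(memory, start, temperature=0.1):
    # Propagate a one-hot start distribution through every time step.
    dist = [0.0] * len(memory[0])
    dist[start] = 1.0
    for nodes_t, nodes_next in zip(memory, memory[1:]):
        A = transition_matrix(nodes_t, nodes_next, temperature)
        dist = [sum(dist[i] * A[i][j] for i in range(len(nodes_t)))
                for j in range(len(nodes_next))]
    return dist

def walk_loss(memory, start, target, temperature=0.1):
    # Negative log-probability of ending the walk at the target node.
    return -math.log(walk(memory, start, temperature)[target])

# Toy memories: 3 time steps, 2 nodes per step, 2-D features.
coherent = [[[1.0, 0.0], [0.0, 1.0]]] * 3
# The middle step is uninformative, as if the object's memory were lost.
incoherent = [[[1.0, 0.0], [0.0, 1.0]],
              [[0.5, 0.5], [0.5, 0.5]],
              [[1.0, 0.0], [0.0, 1.0]]]

print(walk_loss(coherent, start=0, target=0))
print(walk_loss(incoherent, start=0, target=0))
```

In this toy setting the temporally coherent memory yields a lower loss than the one whose middle step carries no information about the object, which is the intuition behind using the walk as a learning signal when the object itself is not visible. In a trained model the features would come from a sequence encoder and the loss would be minimized by gradient descent, neither of which is shown here.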

Code & Models

Repositories

TRI-ML/permatrack
pytorch

Taxonomy

Topics: Human Pose and Action Recognition · Robotics and Sensor-Based Localization · Domain Adaptation and Few-Shot Learning