Out of Sight, Still in Mind: Reasoning and Planning about Unobserved Objects with Video Tracking Enabled Memory Models
Yixuan Huang, Jialin Yuan, Chanho Kim, Pupul Pradhan, Bryan Chen, Li Fuxin, Tucker Hermans

TL;DR
This paper introduces DOOM and LOOM, memory models that let robots reason and plan about occluded objects by encoding object trajectory histories with transformer-based relational dynamics. Both outperform an implicit memory baseline.
Contribution
The paper presents novel transformer-based memory models, DOOM and LOOM, for encoding object trajectories to enhance robotic reasoning and planning with occluded and reappearing objects.
Findings
The models perform well across different numbers of objects and different numbers of distractor actions.
They outperform an implicit memory baseline.
They are effective in both simulation and real-world experiments.
Abstract
Robots need a memory of previously observed but currently occluded objects to work reliably in realistic environments. We investigate the problem of encoding object-oriented memory into a multi-object manipulation reasoning and planning framework. We propose DOOM and LOOM, which leverage transformer relational dynamics to encode the history of trajectories given partial-view point clouds and an object discovery and tracking engine. Our approaches can perform multiple challenging tasks, including reasoning with occluded objects, novel object appearance, and object reappearance. Across extensive simulation and real-world experiments, we find that our approaches perform well with different numbers of objects and different numbers of distractor actions. Furthermore, we show that our approaches outperform an implicit memory baseline.
Taxonomy
Topics: Robotic Path Planning Algorithms · Reinforcement Learning in Robotics · Artificial Intelligence in Games
