Object Goal Navigation with Recursive Implicit Maps
Shizhe Chen, Thomas Chabal, Ivan Laptev, Cordelia Schmid

TL;DR
This paper introduces a recursive implicit map for object goal navigation that combines the advantages of explicit mapping and end-to-end learning. The method outperforms the state of the art on MP3D, generalizes to HM3D, and was deployed on a real robot.
Contribution
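It proposes an implicit spatial map that a transformer updates recursively with each new observation. The model is trained with auxiliary tasks (reconstructing explicit maps and predicting visual features, semantic labels and actions) to encourage spatial reasoning.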
Findings
Outperforms the state of the art on the MP3D dataset
Generalizes well to the HM3D dataset
Successfully deployed on a real robot in real scenes
Abstract
Object goal navigation aims to navigate an agent to locations of a given object category in unseen environments. Classical methods explicitly build maps of environments and require extensive engineering while lacking semantic information for object-oriented exploration. On the other hand, end-to-end learning methods alleviate manual map design and predict actions using implicit representations. Such methods, however, lack an explicit notion of geometry and may have limited ability to encode navigation history. In this work, we propose an implicit spatial map for object goal navigation. Our implicit map is recursively updated with new observations at each step using a transformer. To encourage spatial reasoning, we introduce auxiliary tasks and train our model to reconstruct explicit maps as well as to predict visual features, semantic labels and actions. Our method significantly…
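The recursive update described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes a fixed set of map tokens, a single attention head, and no learned projections, and the names `update_map` and `run_episode` are hypothetical. It only shows the general pattern of a fixed-size implicit map that attends over itself and the newest observation at every step.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def update_map(map_tokens: List[Vector], obs: Vector) -> List[Vector]:
    """One recursive step (hypothetical helper, not the paper's model).

    Each map token attends over the previous map plus the new observation
    using scaled dot-product attention, then adds the result residually.
    A real transformer would also apply learned query/key/value projections.
    """
    context = map_tokens + [obs]
    scale = math.sqrt(len(obs))
    new_map = []
    for query in map_tokens:
        weights = softmax([dot(query, key) / scale for key in context])
        attended = [
            sum(w * value[i] for w, value in zip(weights, context))
            for i in range(len(query))
        ]
        # Residual connection keeps information from earlier steps.
        new_map.append([q + a for q, a in zip(query, attended)])
    return new_map


def run_episode(observations: List[Vector], num_tokens: int = 4, dim: int = 3) -> List[Vector]:
    """Fold a sequence of observation features into a fixed-size map."""
    map_tokens = [[0.0] * dim for _ in range(num_tokens)]  # assumed zero initialization
    for obs in observations:
        map_tokens = update_map(map_tokens, obs)
    return map_tokens
```

The map keeps the same number of tokens however long the episode runs, which is the property that lets such a representation summarize navigation history at constant memory cost.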
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Advanced Image and Video Retrieval Techniques
