Navigating to Objects in the Real World

Theophile Gervet; Soumith Chintala; Dhruv Batra; Jitendra Malik; Devendra Singh Chaplot

arXiv:2212.00922·cs.RO·December 5, 2022

Open Access

TL;DR

This paper empirically compares classical, modular, and end-to-end learning methods for semantic visual navigation on real robots, highlighting modular learning's robustness and identifying simulation gaps affecting real-world performance.

Contribution

The paper provides the first large-scale real-world evaluation of semantic navigation methods, demonstrating the effectiveness of modular learning and analyzing why policies transfer poorly from simulation to reality.

Findings

01. Modular learning achieves a 90% success rate in real-world navigation.

02. End-to-end learning drops from 77% success in simulation to 23% in reality.

03. Gaps between simulation and reality, in both image appearance and error modes, make simulation an unreliable basis for evaluation.

Abstract

Semantic navigation is necessary to deploy mobile robots in uncontrolled environments like our homes, schools, and hospitals. Many learning-based approaches have been proposed in response to the lack of semantic understanding of the classical pipeline for spatial navigation, which builds a geometric map using depth sensors and plans to reach point goals. Broadly, end-to-end learning approaches reactively map sensor inputs to actions with deep neural networks, while modular learning approaches enrich the classical pipeline with learning-based semantic sensing and exploration. But learned visual navigation policies have predominantly been evaluated in simulation. How well do different classes of methods work on a robot? We present a large-scale empirical study of semantic visual navigation methods comparing representative methods from classical, modular, and end-to-end learning approaches…
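The classical pipeline mentioned in the abstract (build a geometric map, then plan to a point goal) can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `plan_to_point_goal`, the grid representation, and the choice of breadth-first search are assumptions made purely for illustration, and the depth-sensor mapping step is replaced by a hand-written occupancy grid.

```python
from collections import deque


def plan_to_point_goal(occupancy, start, goal):
    """Shortest 4-connected path on a 2D occupancy grid via breadth-first search.

    occupancy: list of rows; a truthy cell is an obstacle (hypothetical format).
    start, goal: (row, col) tuples.
    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(occupancy), len(occupancy[0])
    parents = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and not occupancy[nr][nc] and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    return None


# A wall in the middle column forces a detour around the bottom row.
grid = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
path = plan_to_point_goal(grid, (0, 0), (0, 2))
```

A planner like this knows only geometry: it can reach a coordinate but has no notion of what a "chair" or "bed" is, which is the lack of semantic understanding that the learning-based approaches in the abstract aim to address.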

Taxonomy

Topics: Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization · Domain Adaptation and Few-Shot Learning