Learning Exploration Policies for Navigation
Tao Chen, Saurabh Gupta, Abhinav Gupta

TL;DR
This paper presents a learning-based exploration policy for autonomous navigation in complex 3D environments. The policy explores better than classical geometry-based methods and generic learning-based exploration techniques, and it aids downstream tasks.
Contribution
It combines policies with spatial memory, bootstrapping via imitation learning, and fine-tuning with coverage rewards derived from on-board sensors to achieve effective task-agnostic exploration.
Findings
Learned policies outperform classical geometry-based exploration.
Spatial memory and imitation learning improve exploration efficiency.
Exploration policies benefit downstream navigation tasks.
Abstract
Numerous past works have tackled the problem of task-driven navigation. However, how to effectively explore a new environment to enable a variety of downstream tasks has received much less attention. In this work, we study how agents can autonomously explore realistic and complex 3D environments without the context of task rewards. We propose a learning-based approach and investigate different policy architectures, reward functions, and training paradigms. We find that policies with spatial memory that are bootstrapped with imitation learning and then fine-tuned with coverage rewards derived purely from on-board sensors can be effective at exploring novel environments. We show that our learned exploration policies can explore better than classical approaches based on geometry alone and generic learning-based exploration techniques. Finally, we also show how such task-agnostic…
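Illustration
The abstract does not spell out how the coverage reward is computed, so the following is only a minimal sketch of one common reading: the agent is rewarded at each step for the area it observes for the first time. The class name `CoverageTracker`, the grid-cell representation, and the `cell_area` parameter are all assumptions made for this example and are not taken from the paper.

```python
from typing import Iterable, Set, Tuple

Cell = Tuple[int, int]  # (row, col) index of a cell in a hypothetical 2D map


class CoverageTracker:
    """Tracks observed grid cells and rewards newly seen area (illustrative only)."""

    def __init__(self, cell_area: float = 1.0) -> None:
        self.cell_area = cell_area    # assumed area covered by one grid cell
        self.seen: Set[Cell] = set()  # cells observed so far in this episode

    def step(self, visible_cells: Iterable[Cell]) -> float:
        """Return the step reward: total area of cells seen for the first time."""
        new_cells = set(visible_cells) - self.seen
        self.seen |= new_cells
        return self.cell_area * len(new_cells)


tracker = CoverageTracker(cell_area=0.25)
r1 = tracker.step([(0, 0), (0, 1), (1, 1)])  # all three cells are new
r2 = tracker.step([(0, 1), (1, 1), (2, 1)])  # only (2, 1) is new
r3 = tracker.step([(0, 0), (2, 1)])          # nothing new, so no reward
print(r1, r2, r3)
```

Under this reading, revisiting known space earns nothing, so a policy trained on the signal is pushed toward unseen regions without any task-specific reward.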
Taxonomy
Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization
