Learning Sample-Efficient Target Reaching for Mobile Robots
Arbaaz Khan, Vijay Kumar, Alejandro Ribeiro

TL;DR
This paper introduces a hierarchical, self-supervised learning architecture for mobile robots that enables efficient goal-reaching navigation using only sparse range-finder measurements, and that learns significantly faster than existing methods.
Contribution
The paper presents a novel hierarchical architecture with a modified policy gradient algorithm that enhances sample efficiency for goal-reaching tasks in mobile robots.
Findings
Learns to reach goals significantly faster than current state-of-the-art algorithms.
Effective navigation using only sparse range-finder measurements.
Validated through simulation and real-world experiments.
Abstract
In this paper, we propose a novel architecture and a self-supervised policy gradient algorithm, which employs unsupervised auxiliary tasks to enable a mobile robot to learn how to navigate to a given goal. The dependency on global information is eliminated by providing only sparse range-finder measurements to the robot. The partially observable planning problem is addressed by splitting it into a hierarchical process. We use convolutional networks to plan locally, and a differentiable memory to provide information about past time steps in the trajectory. These modules, combined in our network architecture, produce globally consistent plans. The sparse reward problem is mitigated by our modified policy gradient algorithm. We model the robot's uncertainty with unsupervised tasks to force exploration. The novel architecture we propose with the modified version of the policy gradient…
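The idea of pairing a policy gradient objective with auxiliary tasks can be illustrated with a minimal sketch. This is not the authors' modified algorithm, whose details are not given in the abstract; it is a generic REINFORCE-style loss with a weighted auxiliary prediction term added. The function names, the `aux_weight` parameter, and the form of the auxiliary error are all assumptions made for illustration.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return G_t for each step of one episode."""
    returns = []
    g = 0.0
    # Walk the episode backwards so each step reuses the return of the next.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def combined_loss(log_probs, rewards, aux_errors, gamma=0.99, aux_weight=0.1):
    """Generic REINFORCE surrogate loss plus a weighted auxiliary loss.

    log_probs:  log pi(a_t | o_t) for the actions actually taken
    rewards:    per-step rewards (sparse: mostly zero)
    aux_errors: per-step squared errors of a hypothetical auxiliary predictor
                (an assumption; the paper's auxiliary tasks may differ)
    """
    returns = discounted_returns(rewards, gamma)
    # Policy gradient term: minimizing this raises the log-probability
    # of actions in proportion to the return that followed them.
    pg_loss = -sum(lp * g for lp, g in zip(log_probs, returns)) / len(rewards)
    # Auxiliary term: mean prediction error over the episode.
    aux_loss = sum(aux_errors) / len(aux_errors)
    return pg_loss + aux_weight * aux_loss


# Example episode: reward arrives only at the final step.
loss = combined_loss(
    log_probs=[-1.0, -1.0, -1.0],
    rewards=[0.0, 0.0, 1.0],
    aux_errors=[1.0, 1.0, 1.0],
    gamma=0.5,
)
```

In this sketch the auxiliary term still contributes a training signal on episodes where every reward is zero, which is one common motivation for auxiliary tasks under sparse rewards.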
