Image-Goal Navigation in Complex Environments via Modular Learning
Qiaoyun Wu, Jun Wang, Jing Liang, Xiaoxi Gong, and Dinesh Manocha

TL;DR
This paper introduces a modular approach for image-goal navigation that raises success rates and lowers collision rates in complex environments by decoupling navigation components and converting image goals into point-goal tasks.
Contribution
The paper proposes a novel modular framework for image-goal navigation that enhances generalization and efficiency by separating goal planning, collision avoidance, and stopping mechanisms.
Findings
Achieves at least 17% higher success rate in real environments.
Reduces collision rate by 23% compared to state-of-the-art models.
Effective in both simulation and real-world tests.
Abstract
We present a novel approach for image-goal navigation, where an agent navigates with a goal image rather than accurate target information, which is more challenging. Our goal is to decouple the learning of navigation goal planning, collision avoidance, and navigation ending prediction, which enables more concentrated learning of each part. This is realized by four different modules. The first module maintains an obstacle map during robot navigation. The second predicts a long-term goal on the real-time map periodically, which can thus convert an image-goal navigation task to several point-goal navigation tasks. To achieve these point-goal navigation tasks, the third module plans collision-free command sets for navigating to these long-term goals. The final module stops the robot properly near the goal image. The four modules are designed or maintained separately, which helps cut down…
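The four-module decomposition described in the abstract can be illustrated with a toy control loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: the grid world, the names `ObstacleMap`, `predict_long_term_goal`, `plan_path`, `should_stop`, and `navigate`, the breadth-first planner, and the distance-based stop rule are all hypothetical stand-ins. In particular, the paper's goal predictor and stopping module work from images, whereas this sketch substitutes a known target cell so the loop stays runnable.

```python
from collections import deque
from typing import List, Optional, Set, Tuple

Cell = Tuple[int, int]


class ObstacleMap:
    """Module 1 (stand-in): an occupancy grid filled in as the robot moves."""

    def __init__(self, height: int, width: int) -> None:
        self.height, self.width = height, width
        self.occupied: Set[Cell] = set()

    def update(self, observed: Set[Cell]) -> None:
        self.occupied |= observed

    def is_free(self, cell: Cell) -> bool:
        r, c = cell
        return 0 <= r < self.height and 0 <= c < self.width and cell not in self.occupied


def predict_long_term_goal(grid: ObstacleMap, target_hint: Cell) -> Cell:
    """Module 2 (placeholder): a real system would predict this from the goal image;
    here the hypothetical hint cell is returned unchanged."""
    return target_hint


def plan_path(grid: ObstacleMap, start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Module 3 (stand-in): breadth-first search over cells currently believed free."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if grid.is_free(nxt) and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    return None


def should_stop(pose: Cell, target: Cell) -> bool:
    """Module 4 (stand-in): stop within one cell of the target. This replaces the
    image-based stopping decision with a simple Manhattan-distance check."""
    return abs(pose[0] - target[0]) + abs(pose[1] - target[1]) <= 1


def navigate(world: Set[Cell], start: Cell, target: Cell, size: Tuple[int, int] = (5, 5),
             replan_every: int = 3, max_steps: int = 50) -> Tuple[List[Cell], bool]:
    grid = ObstacleMap(*size)
    pose, path, trajectory = start, [], [start]
    for step in range(max_steps):
        # Module 1: sense obstacles in the 3x3 neighbourhood and add them to the map.
        r, c = pose
        nearby = {(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}
        grid.update(nearby & world)
        # Module 4: decide whether to end the episode.
        if should_stop(pose, target):
            return trajectory, True
        # Module 2: re-predict the long-term goal periodically, or when the plan is stale.
        stale = not path or any(not grid.is_free(cell) for cell in path)
        if step % replan_every == 0 or stale:
            goal = predict_long_term_goal(grid, target)
            # Module 3: plan a collision-free path to that point goal.
            path = (plan_path(grid, pose, goal) or [])[1:]
        if not path:
            return trajectory, False
        pose = path.pop(0)
        trajectory.append(pose)
    return trajectory, False


if __name__ == "__main__":
    wall = {(1, 0), (1, 1), (1, 2), (1, 3)}
    route, reached = navigate(wall, start=(0, 0), target=(4, 0))
    print(reached, route)
```

Keeping each module behind its own function means any one of them can be swapped out (for example, replacing the breadth-first planner with a learned local policy) without touching the others, which is the kind of separation the abstract argues for.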
Taxonomy
Topics: Multimodal Machine Learning Applications · Robotic Path Planning Algorithms · Domain Adaptation and Few-Shot Learning
