TL;DR
SEMNAV introduces a semantic segmentation-based approach for visual semantic navigation, significantly improving generalization and success rates in both simulated and real-world robotic environments.
Contribution
The paper proposes SEMNAV, a novel semantic segmentation-driven navigation model, and introduces the SEMNAV dataset for training and evaluating semantic-aware navigation systems.
Findings
SEMNAV achieves higher success rates than existing models in the Habitat 2.0 simulator.
Semantic segmentation improves sim-to-real transfer in robotic navigation.
The code and datasets are publicly available on GitHub.
Abstract
Visual Semantic Navigation (VSN) is a fundamental problem in robotics, where an agent must navigate toward a target object in an unknown environment, mainly using visual information. Most state-of-the-art VSN models are trained in simulation environments, where rendered scenes of the real world are used, at best. These approaches typically rely on raw RGB data from the virtual scenes, which limits their ability to generalize to real-world environments due to domain adaptation issues. To tackle this problem, in this work, we propose SEMNAV, a novel approach that leverages semantic segmentation as the main visual input representation of the environment to enhance the agent's perception and decision-making capabilities. By explicitly incorporating this type of high-level semantic information, our model learns robust navigation policies that improve generalization across unseen…
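To make the idea of using semantic segmentation as the main visual input more concrete, here is a minimal sketch of how a per-pixel label map could be turned into a policy observation. It is not the authors' implementation: the abstract does not specify the encoding, so the one-hot layout, the `NUM_CLASSES` value, the class ids, and the helper names `one_hot_segmentation` and `target_visible_fraction` are all assumptions made for illustration.

```python
from typing import List

# Hypothetical label set; the categories used in the paper are not listed here.
NUM_CLASSES = 4  # illustrative ids: 0=floor, 1=wall, 2=chair, 3=bed


def one_hot_segmentation(
    label_map: List[List[int]], num_classes: int
) -> List[List[List[float]]]:
    """Convert an H x W map of class ids into a num_classes x H x W one-hot grid."""
    height = len(label_map)
    width = len(label_map[0]) if height else 0
    channels = [
        [[0.0] * width for _ in range(height)] for _ in range(num_classes)
    ]
    for row in range(height):
        for col in range(width):
            class_id = label_map[row][col]
            if not 0 <= class_id < num_classes:
                raise ValueError(f"class id {class_id} out of range")
            channels[class_id][row][col] = 1.0
    return channels


def target_visible_fraction(label_map: List[List[int]], target_class: int) -> float:
    """Return the fraction of pixels labelled as the target class."""
    total = sum(len(row) for row in label_map)
    hits = sum(1 for row in label_map for cid in row if cid == target_class)
    return hits / total if total else 0.0


# A tiny 2 x 3 "frame" of class ids standing in for a segmentation output.
frame = [
    [1, 1, 1],
    [0, 2, 2],
]
observation = one_hot_segmentation(frame, NUM_CLASSES)
chair_fraction = target_visible_fraction(frame, target_class=2)
```

Because the observation contains only class ids, a simulated frame and a real camera frame that segment to the same labels produce the same input, which is the intuition behind the sim-to-real argument in the abstract.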
Taxonomy
Topics: Multimodal Machine Learning Applications · Semantic Web and Ontologies · Advanced Image and Video Retrieval Techniques
