SEMNAV: Enhancing Visual Semantic Navigation in Robotics through Semantic Segmentation

Rafael Flor-Rodríguez; Carlos Gutiérrez-Álvarez; Francisco Javier Acevedo-Rodríguez; Sergio Lafuente-Arroyo; Roberto J. López-Sastre

arXiv:2506.01418·cs.RO·May 20, 2026

TL;DR

SEMNAV introduces a semantic segmentation-based approach for visual semantic navigation, significantly improving generalization and success rates in both simulated and real-world robotic environments.

Contribution

The paper proposes SEMNAV, a novel semantic segmentation-driven navigation model, and introduces the SEMNAV dataset for training and evaluating semantic-aware navigation systems.

Findings

01. SEMNAV achieves higher success rates than existing models in Habitat 2.0.

02. Semantic segmentation improves sim-to-real transfer in robotic navigation.

03. The code and dataset are publicly available in the gramuah/semnav repository on GitHub.

Abstract

Visual Semantic Navigation (VSN) is a fundamental problem in robotics, where an agent must navigate toward a target object in an unknown environment, mainly using visual information. Most state-of-the-art VSN models are trained in simulation environments, where rendered scenes of the real world are used, at best. These approaches typically rely on raw RGB data from the virtual scenes, which limits their ability to generalize to real-world environments due to domain adaptation issues. To tackle this problem, in this work, we propose SEMNAV, a novel approach that leverages semantic segmentation as the main visual input representation of the environment to enhance the agent's perception and decision-making capabilities. By explicitly incorporating this type of high-level semantic information, our model learns robust navigation policies that improve generalization across unseen…
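The abstract's central idea, replacing raw RGB with a semantic segmentation map as the agent's visual input, can be illustrated with a small sketch. This is not the authors' implementation: the function name `semantic_observation`, the one-hot encoding, and the toy class IDs are assumptions chosen for illustration. It shows one common way to turn a per-pixel label map into a tensor a navigation policy could consume.

```python
import numpy as np


def semantic_observation(label_map: np.ndarray, num_classes: int) -> np.ndarray:
    """Convert an (H, W) integer label map into a (num_classes, H, W) one-hot tensor.

    Hypothetical helper: one-hot encoding is an assumed input format,
    not necessarily the representation used by SEMNAV.
    """
    if label_map.ndim != 2:
        raise ValueError("label_map must be 2-D (H, W)")
    if label_map.min() < 0 or label_map.max() >= num_classes:
        raise ValueError("label IDs must lie in [0, num_classes)")
    # Indexing the identity matrix by class ID yields shape (H, W, num_classes).
    one_hot = np.eye(num_classes, dtype=np.float32)[label_map]
    # Move the class axis first, the usual layout for convolutional encoders.
    return np.transpose(one_hot, (2, 0, 1))


# Toy 2x3 segmentation with made-up classes: 0 = floor, 1 = wall, 2 = chair.
labels = np.array([[0, 1, 1],
                   [0, 2, 2]])
obs = semantic_observation(labels, num_classes=3)
print(obs.shape)  # (3, 2, 3)
```

Because every pixel carries only a class identity, the same encoding applies whether the label map comes from a simulator's ground truth or from a segmentation network running on a real camera, which is the intuition behind the sim-to-real argument in the abstract.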

Code & Models

Repositories

gramuah/semnav (GitHub)

Taxonomy

Topics: Multimodal Machine Learning Applications · Semantic Web and Ontologies · Advanced Image and Video Retrieval Techniques