SemNav: A Model-Based Planner for Zero-Shot Object Goal Navigation Using Vision-Foundation Models
Arnab Debnath, Gregory J. Stein, Jana Kosecka

TL;DR
This paper introduces SemNav, a zero-shot object goal navigation framework that combines vision foundation models with a model-based planner, enabling scalable and generalizable navigation without task-specific training.
Contribution
It integrates vision foundation models with a model-based planner for zero-shot object goal navigation and reports state-of-the-art results on HM3D.
Findings
Achieves top performance on the HM3D dataset for zero-shot object goal navigation.
Outperforms existing methods on Success weighted by Path Length (SPL).
Demonstrates effective scene understanding and object localization without task-specific training.
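For readers unfamiliar with the metric named above, the following is a minimal sketch of SPL as it is commonly defined for embodied navigation benchmarks: the mean over episodes of S_i * l_i / max(p_i, l_i), where S_i is the binary success flag, l_i the shortest-path length to the goal, and p_i the length of the path the agent actually took. The function name, argument names, and episode values are illustrative assumptions, not taken from the paper or its code, and the numbers are not results from the paper.

```python
from typing import Sequence


def success_weighted_path_length(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Compute SPL = mean_i( S_i * l_i / max(p_i, l_i) ).

    All names here are hypothetical; only the formula is the standard one.
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if len(shortest_lengths) != n or len(path_lengths) != n:
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        denom = max(taken, shortest)
        # Failed episodes contribute 0; guard against a zero-length episode.
        if success and denom > 0:
            total += shortest / denom
    return total / n


# Illustrative episodes (made-up values):
# 1) success on the optimal path, 2) success on a path twice as long, 3) failure.
example_spl = success_weighted_path_length(
    successes=[True, True, False],
    shortest_lengths=[10.0, 5.0, 8.0],
    path_lengths=[10.0, 10.0, 8.0],
)
```

Because each successful episode is scaled by the ratio of the shortest path to the path taken, an agent that reaches the goal by a long detour scores lower than one that reaches it efficiently, which is why SPL is reported alongside plain success rate.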
Abstract
Object goal navigation is a fundamental task in embodied AI, where an agent is instructed to locate a target object in an unexplored environment. Traditional learning-based methods rely heavily on large-scale annotated data or require extensive interaction with the environment in a reinforcement learning setting, often failing to generalize to novel environments and limiting scalability. To overcome these challenges, we explore a zero-shot setting where the agent operates without task-specific training, enabling a more scalable and adaptable solution. Recent advances in Vision Foundation Models (VFMs) offer powerful capabilities for visual understanding and reasoning, making them ideal for agents to comprehend scenes, identify relevant regions, and infer the likely locations of objects. In this work, we present a zero-shot object goal navigation framework that integrates the perceptual…
Taxonomy
Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Advanced Neural Network Applications
