TL;DR
DRIVE-Nav introduces a directional reasoning framework for open-vocabulary navigation, reducing redundant exploration and improving efficiency through structured directional inspection and verification.
Contribution
The paper proposes a novel structured exploration framework that organizes navigation around persistent directions, enhancing efficiency and grounding reliability in open-vocabulary object navigation.
Findings
Achieves a 50.2% success rate (SR) and 32.6% Success weighted by Path Length (SPL) on HM3D-OVON, surpassing previous methods.
Demonstrates consistent efficiency improvements across multiple datasets.
Transfers successfully to physical deployment on real-world robots.
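For readers unfamiliar with the two reported metrics, the sketch below shows how SR and SPL are conventionally computed in embodied navigation benchmarks. It is an illustration of the standard metric definitions, not the paper's evaluation code; the function names and the episode values in the usage note are hypothetical.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent reached the target."""
    return sum(successes) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, averaged over all episodes.

    Each successful episode contributes l / max(p, l), where l is the
    shortest-path length and p is the length of the path actually taken.
    Failed episodes contribute 0. Assumes all lengths are positive.
    """
    total = 0.0
    for ok, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        if ok:
            total += shortest / max(taken, shortest)
    return total / len(successes)
```

As a made-up example, four episodes with outcomes `[True, True, False, True]`, shortest lengths `[5, 10, 8, 4]`, and travelled lengths `[5, 20, 30, 8]` give an SR of 0.75 and an SPL of 0.5: a detour on a successful episode lowers SPL but leaves SR unchanged, which is why the two numbers are reported together.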
Abstract
Open-Vocabulary Object Navigation (OVON) requires an embodied agent to locate a language-specified target in unknown environments. Existing zero-shot methods often reason over dense frontier points under incomplete observations, causing unstable route selection, repeated revisits, and unnecessary action overhead. We present DRIVE-Nav, a structured framework that organizes exploration around persistent directions rather than raw frontiers. By inspecting encountered directions more completely and restricting subsequent decisions to still-relevant directions within a forward 240-degree view range, DRIVE-Nav reduces redundant revisits and improves path efficiency. The framework extracts and tracks directional candidates from weighted Fast Marching Method (FMM) paths, maintains representative views for semantic inspection, and combines vision-language-guided prompt enrichment with…
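The restriction to a forward 240-degree view range can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: it assumes each candidate direction is a heading in degrees, that the range is centred on the agent's current heading (120 degrees to either side), and the names `angular_offset` and `forward_candidates` are hypothetical.

```python
from typing import List, Sequence


def angular_offset(heading_deg: float, direction_deg: float) -> float:
    """Signed smallest angle from heading to direction, in [-180, 180)."""
    return (direction_deg - heading_deg + 180.0) % 360.0 - 180.0


def forward_candidates(
    heading_deg: float,
    directions_deg: Sequence[float],
    fov_deg: float = 240.0,
) -> List[float]:
    """Keep only directions inside a forward range centred on the heading.

    Assumption: the range is symmetric, so a direction is kept when its
    absolute offset from the heading is at most fov_deg / 2.
    """
    half = fov_deg / 2.0
    return [
        d for d in directions_deg
        if abs(angular_offset(heading_deg, d)) <= half
    ]
```

Under these assumptions, an agent heading at 0 degrees with candidates `[0, 90, 121, 180, 270, 350]` would keep `[0, 90, 270, 350]` and drop the two directions that lie behind it, which is one simple way to avoid reconsidering directions the agent has already moved past.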
