BeliefMapNav: 3D Voxel-Based Belief Map for Zero-Shot Object Navigation
Zibo Zhou, Yue Hu, Lingkai Zhang, Zonglin Li, Siheng Chen

TL;DR
BeliefMapNav introduces a 3D voxel-based belief map that combines semantic reasoning and spatial understanding to improve zero-shot object navigation in complex environments, achieving state-of-the-art results.
Contribution
The paper presents a novel 3D voxel belief map that integrates semantic priors with hierarchical spatial structure for enhanced zero-shot navigation.
Findings
Achieves state-of-the-art (SOTA) success rate (SR) and Success weighted by Path Length (SPL) on multiple benchmarks.
Improves SPL by 46.4% over previous methods.
Effectively combines LLM semantic reasoning with 3D spatial mapping.
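For readers unfamiliar with the metric, SPL is usually computed with the standard definition from the embodied-navigation literature, shown below. The summary itself does not restate the formula, so the symbols here are assumptions for illustration: N is the number of episodes, S_i is 1 if episode i succeeds and 0 otherwise, \ell_i is the shortest-path length from start to goal, and p_i is the length of the path the agent actually took.

```latex
\[
\mathrm{SPL} \;=\; \frac{1}{N} \sum_{i=1}^{N} S_i \, \frac{\ell_i}{\max(p_i,\ \ell_i)}
\]
```

Under this definition a failed episode contributes 0, and a successful episode contributes at most 1, reached only when the agent follows a shortest path. A higher SPL therefore reflects both more successes and shorter routes.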
Abstract
Zero-shot object navigation (ZSON) allows robots to find target objects in unfamiliar environments using natural language instructions, without relying on pre-built maps or task-specific training. Recent general-purpose models, such as large language models (LLMs) and vision-language models (VLMs), equip agents with semantic reasoning abilities to estimate target object locations in a zero-shot manner. However, these models often greedily select the next goal without maintaining a global understanding of the environment and are fundamentally limited in the spatial reasoning necessary for effective navigation. To overcome these limitations, we propose a novel 3D voxel-based belief map that estimates the target's prior presence distribution within a voxelized 3D space. This approach enables agents to integrate semantic priors from LLMs and visual embeddings with hierarchical spatial…
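The idea of a belief map over voxels can be illustrated with a small sketch. This is not the authors' implementation: the function names, the dictionary-based grid, and the multiplicative update rule are all assumptions chosen for illustration. It only shows how a per-voxel probability of target presence might be reweighted by semantic relevance scores (for example, scores an LLM or VLM might supply) and then renormalized.

```python
from itertools import product

def uniform_belief(shape):
    """Return a uniform prior over every voxel index in a 3D grid (hypothetical helper)."""
    voxels = list(product(*(range(n) for n in shape)))
    p = 1.0 / len(voxels)
    return {v: p for v in voxels}

def update_belief(belief, relevance, seen_empty=frozenset()):
    """Reweight the belief by per-voxel relevance scores and renormalize.

    relevance:  maps voxel -> non-negative score; missing voxels default to 1.0.
    seen_empty: voxels already observed without the target; their mass is set to 0.
    The multiplicative rule is an assumption, not the paper's update.
    """
    unnormalized = {
        v: 0.0 if v in seen_empty else p * relevance.get(v, 1.0)
        for v, p in belief.items()
    }
    total = sum(unnormalized.values())
    if total <= 0.0:
        raise ValueError("belief mass vanished; no voxel can contain the target")
    return {v: p / total for v, p in unnormalized.items()}

def most_likely_voxel(belief):
    """Return the voxel index with the highest belief."""
    return max(belief, key=belief.get)

# Example: a 4 x 4 x 2 grid where one voxel is judged five times more relevant
# and one voxel has already been observed to be empty.
belief = uniform_belief((4, 4, 2))
belief = update_belief(belief, {(2, 3, 1): 5.0}, seen_empty={(0, 0, 0)})
goal = most_likely_voxel(belief)
```

Keeping a distribution over the whole grid, rather than committing to the single best-scoring location at each step, is what lets an agent retain a global picture of where the target might be as observations accumulate.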
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
