ConsistNav: Closing the Action Consistency Gap in Zero-Shot Object Navigation with Semantic Executive Control
Haosen Wang, Zhenyang Li, Yinqiang Zhang, Zongqi He, Lutao Jiang, Kai Li, Yizhou Zhao, Liaoyuan Fan, Wenjian Hou, Tingbang Liang, Yibin Wen, Defeng Gu

TL;DR
ConsistNav introduces a semantic executive control framework for zero-shot object navigation, effectively addressing the action consistency gap and improving navigation success and efficiency.
Contribution
It proposes a training-free, modular semantic executive system that enhances zero-shot ObjectNav by maintaining persistent target hypotheses and controlling action decisions.
Findings
Achieves state-of-the-art results on HM3D and MP3D datasets.
Improves success rate (SR) by 11.4% and SPL by 7.9% over baseline.
Demonstrates robustness through ablation and real-world tests.
Abstract
Zero-shot object navigation has advanced rapidly with open-vocabulary detectors, image--text models, and language-guided exploration. However, even after current methods detect a plausible target hypothesis, the agent may still oscillate between exploration and pursuit, or abandon the object near success. We identify this failure mode as an action consistency gap: semantic evidence is repeatedly reinterpreted at each step without persistent commitment across the episode. We introduce ConsistNav, a training-free zero-shot ObjectNav framework built around a semantic executive composed of three coordinated modules: Finite-State Executive Controller stages target pursuit through guarded semantic phases; Persistent Candidate Memory accumulates cross-frame target evidence into stable object hypotheses; and Stability-Aware Action Control suppresses rotational stagnation, ineffective pursuit,…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
