Expand Your SCOPE: Semantic Cognition over Potential-Based Exploration for Embodied Visual Navigation
Ningnan Wang, Weihuang Chen, Liming Chen, Haoxuan Ji, Zhongyu Guo, Xuchong Zhang, Hongbin Sun

TL;DR
This paper introduces SCOPE, a zero-shot embodied visual navigation framework that leverages frontier information and a spatio-temporal potential graph to improve long-horizon planning and decision-making.
Contribution
SCOPE is the first to explicitly incorporate frontier boundary information and a self-reconsideration mechanism into potential-based exploration for embodied navigation.
Findings
SCOPE outperforms baselines by 4.6% in accuracy.
It improves calibration and generalization.
Core components enhance decision quality.
Abstract
Embodied visual navigation remains a challenging task, as agents must explore unknown environments with limited knowledge. Existing zero-shot studies have shown that incorporating memory mechanisms to support goal-directed behavior can improve long-horizon planning performance. However, they overlook visual frontier boundaries, which fundamentally dictate future trajectories and observations, and fall short of inferring the relationship between partial visual observations and navigation goals. In this paper, we propose Semantic Cognition Over Potential-based Exploration (SCOPE), a zero-shot framework that explicitly leverages frontier information to drive potential-based exploration, enabling more informed and goal-relevant decisions. SCOPE estimates exploration potential with a Vision-Language Model and organizes it into a spatio-temporal potential graph, capturing boundary dynamics to…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMultimodal Machine Learning Applications · Reinforcement Learning in Robotics · Social Robot Interaction and HRI
