Explore Like Humans: Autonomous Exploration with Online SG-Memo Construction for Embodied Agents
Xu Chen, Shichao Xie, Zhining Gu, Lu Jia, Minghua Luo, Fei Liu, Zedong Chu, Yanfen Shen, Xiaolong Wu, Mu Xu

TL;DR
This paper introduces ABot-Explorer, an online RGB-based exploration framework that uses vision-language models to build semantic spatial memory, improving navigation efficiency and environment understanding in embodied agents.
Contribution
The paper presents an active exploration method that unifies memory construction and exploration in a single online process, using semantic anchors and a hierarchical memory to enable human-like navigation.
Findings
ABot-Explorer outperforms state-of-the-art methods in exploration efficiency.
The framework effectively integrates semantic navigational affordances into memory.
The generated SG-Memo supports diverse downstream tasks.
Abstract
Constructing structured spatial memory is essential for enabling long-horizon reasoning in complex embodied navigation tasks. Current memory construction predominantly relies on a decoupled, two-stage paradigm: agents first aggregate environmental data through exploration, followed by the offline reconstruction of spatial memory. However, this post-hoc and geometry-centric approach precludes agents from leveraging high-level semantic intelligence, often causing them to overlook navigationally critical landmarks (e.g., doorways and staircases) that serve as fundamental semantic anchors in human cognitive maps. To bridge this gap, we propose ABot-Explorer, a novel active exploration framework that unifies memory construction and exploration into an online, RGB-only process. At its core, ABot-Explorer leverages Large Vision-Language Models (VLMs) to distill Semantic Navigational…
