Imaginative World Modeling with Scene Graphs for Embodied Agent Navigation
Yue Hu, Junzhe Wu, Ruihan Xu, Hang Liu, Avery Xi, Henry X. Liu, Ram Vasudevan, Maani Ghaffari

TL;DR
This paper introduces SGImagineNav, a novel semantic navigation framework that uses scene graphs and language models to predict and explore unseen environments, significantly improving success rates in real-world and benchmark tests.
Contribution
The paper presents a new imaginative navigation approach that builds hierarchical scene graphs and leverages language models for proactive environment exploration.
Findings
Outperforms previous methods, reaching success rates of 65.4% on HM3D and 66.8% on HSSD.
Enables cross-floor and cross-room navigation in real-world environments.
Demonstrates improved navigation efficiency through semantic shortcuts and exploration strategies.
Abstract
Semantic navigation requires an agent to navigate toward a specified target in an unseen environment. Employing an imaginative navigation strategy that predicts future scenes before taking action can empower the agent to find the target faster. Inspired by this idea, we propose SGImagineNav, a novel imaginative navigation framework that leverages symbolic world modeling to proactively build a global environmental representation. SGImagineNav maintains an evolving hierarchical scene graph and uses large language models to predict and explore unseen parts of the environment. While existing methods rely solely on past observations, this imaginative scene graph provides richer semantic context, enabling the agent to proactively estimate target locations. Building upon this, SGImagineNav adopts an adaptive navigation strategy that exploits semantic shortcuts when promising and explores…
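To make the idea in the abstract concrete, here is a minimal Python sketch of a hierarchical scene graph that mixes observed and imagined nodes, together with a simple rule for switching between exploiting a semantic shortcut and exploring. It is an illustration under stated assumptions, not the authors' implementation: the names `Node`, `iter_nodes`, `choose_goal`, the `target_prob` field, and the fixed `threshold` are all hypothetical, and the scores are hard-coded where the paper would use a large language model to predict them.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """One node of a hierarchical scene graph (hypothetical schema)."""
    name: str
    level: str                 # e.g. "building", "floor", "room", "object"
    observed: bool = True      # False marks an imagined (predicted) node
    target_prob: float = 0.0   # assumed score that the target is located here
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child


def iter_nodes(root: Node) -> Iterator[Node]:
    """Depth-first traversal over the whole graph."""
    yield root
    for child in root.children:
        yield from iter_nodes(child)


def choose_goal(root: Node, threshold: float = 0.5) -> Tuple[str, str]:
    """Pick a room to head for.

    If some room scores at or above the threshold, treat it as a semantic
    shortcut and exploit it. Otherwise explore, preferring imagined rooms
    because they have not been observed yet.
    """
    rooms = [n for n in iter_nodes(root) if n.level == "room"]
    best = max(rooms, key=lambda n: n.target_prob)
    if best.target_prob >= threshold:
        return "exploit", best.name
    imagined = [n for n in rooms if not n.observed]
    fallback = max(imagined or rooms, key=lambda n: n.target_prob)
    return "explore", fallback.name


# Toy graph: one observed room and one imagined room (scores are made up).
house = Node("house", "building")
floor_1 = house.add(Node("floor_1", "floor"))
floor_1.add(Node("kitchen", "room", observed=True, target_prob=0.1))
floor_1.add(Node("bedroom", "room", observed=False, target_prob=0.7))

print(choose_goal(house))                 # ('exploit', 'bedroom')
print(choose_goal(house, threshold=0.9))  # ('explore', 'bedroom')
```

Keeping imagined nodes in the same tree as observed ones, distinguished only by a flag, lets a single traversal score both kinds of location; a real system would also need geometry, path costs, and a way to revise imagined nodes once they are observed.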
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Games
