Open Scene Graphs for Open World Object-Goal Navigation
Joel Loo, Zhanxin Wu, David Hsu

TL;DR
This paper introduces Open Scene Graphs, a topo-semantic representation that, combined with foundation models, enables robots to perform open-world object-goal navigation with zero-shot generalization in diverse environments.
Contribution
It proposes Open Scene Graphs, a novel scene representation that integrates with foundation models to support open-world navigation tasks.
Findings
OSGs enhance reasoning with Large Language Models (LLMs).
The OpenSearch system generalizes zero-shot across diverse environments and embodiments.
Robust object-goal navigation is demonstrated in simulations and real-world experiments.
Abstract
How can we build robots for open-world semantic navigation tasks, like searching for target objects in novel scenes? While foundation models have the rich knowledge and generalisation needed for these tasks, a suitable scene representation is needed to connect them into a complete robot system. We address this with Open Scene Graphs (OSGs), a topo-semantic representation that retains and organises open-set scene information for these models, and has a structure that can be configured for different environment types. We integrate foundation models and OSGs into the OpenSearch system for Open World Object-Goal Navigation, which is capable of searching for open-set objects specified in natural language, while generalising zero-shot across diverse environments and embodiments. Our OSGs enhance reasoning with Large Language Models (LLM), enabling robust object-goal navigation outperforming…
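The abstract describes OSGs as a topo-semantic representation whose structure can be configured per environment type. A minimal sketch of that idea follows, assuming a graph with free-text node labels, containment and connectivity edges, and a schema that says which node types may contain which. All names here (OpenSceneGraphSketch, HOME_SCHEMA, the floor/room/object types) are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass, field

# Hypothetical schema for a home-like environment: parent type -> allowed child types.
# The paper's actual schema format is not given in this summary.
HOME_SCHEMA = {"floor": {"room"}, "room": {"object"}}


@dataclass
class OpenSceneGraphSketch:
    """Illustrative topo-semantic graph; not the authors' implementation."""
    schema: dict
    node_type: dict = field(default_factory=dict)  # node id -> type name
    label: dict = field(default_factory=dict)      # node id -> open-set text label
    contains: dict = field(default_factory=dict)   # parent id -> set of child ids
    connects: dict = field(default_factory=dict)   # node id -> set of adjacent ids

    def add_node(self, name, ntype, label, parent=None):
        # Enforce the configured structure before recording containment.
        if parent is not None:
            ptype = self.node_type[parent]
            if ntype not in self.schema.get(ptype, set()):
                raise ValueError(f"{ptype} cannot contain {ntype}")
            self.contains.setdefault(parent, set()).add(name)
        self.node_type[name] = ntype
        self.label[name] = label

    def connect(self, a, b):
        # Undirected topological edge, e.g. two rooms joined by a doorway.
        self.connects.setdefault(a, set()).add(b)
        self.connects.setdefault(b, set()).add(a)

    def describe(self):
        # One text line per node, a form that could be placed in an LLM prompt.
        lines = []
        for name in sorted(self.node_type):
            kids = sorted(self.contains.get(name, ()))
            adj = sorted(self.connects.get(name, ()))
            lines.append(
                f"{self.node_type[name]} '{self.label[name]}' ({name}): "
                f"contains={kids}, connects={adj}"
            )
        return lines
```

Swapping HOME_SCHEMA for a different dictionary is one way such a structure could be reconfigured for another environment type without changing the graph code.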
Taxonomy
Topics: Robotic Path Planning Algorithms · Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization
