Open Scene Graphs for Open-World Object-Goal Navigation

Joel Loo; Zhanxin Wu; David Hsu

arXiv:2508.04678·cs.RO·August 7, 2025

TL;DR

This paper introduces OSG Navigator, a modular system that combines foundation models with Open Scene Graphs so that robots can perform open-world Object-Goal Navigation (ObjectNav), with zero-shot generalization and state-of-the-art performance.

Contribution

The paper presents the Open Scene Graph (OSG), a novel spatial memory representation, and a modular navigation system that leverages foundation models for zero-shot, open-world Object-Goal Navigation.

Findings

01. Achieves state-of-the-art results on ObjectNav benchmarks.

02. Demonstrates zero-shot generalization across diverse environments and goals.

03. Validates effectiveness on both simulation and real-world robots.

Abstract

How can we build general-purpose robot systems for open-world semantic navigation, e.g., searching a novel environment for a target object specified in natural language? To tackle this challenge, we introduce OSG Navigator, a modular system composed of foundation models, for open-world Object-Goal Navigation (ObjectNav). Foundation models provide enormous semantic knowledge about the world, but struggle to organise and maintain spatial information effectively at scale. Key to OSG Navigator is the Open Scene Graph representation, which acts as spatial memory for OSG Navigator. It organises spatial information hierarchically using OSG schemas, which are templates, each describing the common structure of a class of environments. OSG schemas can be automatically generated from simple semantic labels of a given environment, e.g., "home" or "supermarket". They enable OSG Navigator to adapt…
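
To make the idea of a schema-constrained hierarchy concrete, here is a minimal Python sketch. It is an illustration only: the abstract does not show the paper's actual schema format, so `HOME_SCHEMA`, `Node`, `SceneGraph`, and the layer names `floor`, `room`, and `object` are all hypothetical. The sketch assumes a schema can be approximated as a mapping from each node type to the child types it may contain, and that the scene graph rejects insertions the schema does not allow.

```python
from dataclasses import dataclass, field

# Hypothetical schema for an environment labelled "home".
# Each node type maps to the child types it is allowed to contain.
HOME_SCHEMA = {
    "floor": ["room"],
    "room": ["object"],
    "object": [],
}


@dataclass
class Node:
    name: str  # free-form label, e.g. "kitchen"
    kind: str  # node type; must be a key of the schema
    children: list = field(default_factory=list)


class SceneGraph:
    """Toy hierarchical spatial memory whose structure is checked against a schema."""

    def __init__(self, schema, root):
        self.schema = schema
        self.root = root

    def add(self, parent, child):
        # Only attach the child if the schema permits this parent/child pairing.
        if child.kind not in self.schema.get(parent.kind, []):
            raise ValueError(f"{parent.kind!r} cannot contain {child.kind!r}")
        parent.children.append(child)
        return child

    def find(self, name):
        # Depth-first search; returns the path of names from the root, or None.
        stack = [(self.root, [self.root.name])]
        while stack:
            node, path = stack.pop()
            if node.name == name:
                return path
            for c in node.children:
                stack.append((c, path + [c.name]))
        return None


# Usage: build a small graph and look up where an object was seen.
ground = Node("ground floor", "floor")
graph = SceneGraph(HOME_SCHEMA, ground)
kitchen = graph.add(ground, Node("kitchen", "room"))
graph.add(kitchen, Node("mug", "object"))
path_to_mug = graph.find("mug")
```

Under this assumption, supporting a different class of environment (say, "supermarket") would mean supplying a different schema dictionary while leaving the graph code unchanged, which is the kind of adaptability the abstract attributes to OSG schemas.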
