Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation
Abdelrhman Werby, Chenguang Huang, Martin Büchner, Abhinav Valada, Wolfram Burgard

TL;DR
This paper introduces HOV-SG, a hierarchical 3D scene graph mapping method for language-grounded robot navigation. It represents floor-, room-, and object-level concepts with open-vocabulary features, improving open-vocabulary semantic accuracy while producing a smaller representation than dense maps.
Contribution
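The paper presents a hierarchical open-vocabulary 3D scene graph that organizes an environment into floor, room, and object concepts, each enriched with open-vocabulary features from vision foundation models, to support semantic understanding and navigation in large, multi-story environments.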
Findings
Surpasses previous baselines in open-vocabulary semantic accuracy.
Reduces representation size by 75% compared to dense open-vocabulary maps.
Demonstrates successful long-horizon navigation in real-world multi-story buildings.
Abstract
Recent open-vocabulary robot mapping methods enrich dense geometric maps with pre-trained visual-language features. While these maps allow for the prediction of point-wise saliency maps when queried for a certain language concept, large-scale environments and abstract queries beyond the object level still pose a considerable hurdle, ultimately limiting language-grounded robotic navigation. In this work, we present HOV-SG, a hierarchical open-vocabulary 3D scene graph mapping approach for language-grounded robot navigation. Leveraging open-vocabulary vision foundation models, we first obtain state-of-the-art open-vocabulary segment-level maps in 3D and subsequently construct a 3D scene graph hierarchy consisting of floor, room, and object concepts, each enriched with open-vocabulary features. Our approach is able to represent multi-story buildings and allows robotic traversal of those…
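To make the floor, room, and object hierarchy described in the abstract more concrete, the following is a minimal Python sketch of such a graph and of one plausible way to query it level by level. It is an illustration under stated assumptions, not the authors' implementation: the names `Node`, `cosine`, `best_child`, and `resolve` are hypothetical, the three-dimensional vectors are toy stand-ins for the open-vocabulary features a vision-language model would produce, and the per-level query vectors are assumed to be given.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import List


@dataclass
class Node:
    """One scene-graph node: the building root, a floor, a room, or an object."""
    name: str
    level: str                 # "root", "floor", "room", or "object"
    feature: List[float]       # toy stand-in for an open-vocabulary embedding
    children: List["Node"] = field(default_factory=list)


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_child(parent: Node, query: List[float]) -> Node:
    """Return the child whose feature is most similar to the query vector."""
    return max(parent.children, key=lambda child: cosine(child.feature, query))


def resolve(root: Node, floor_q: List[float], room_q: List[float],
            object_q: List[float]) -> List[str]:
    """Descend floor -> room -> object, narrowing the search at each level."""
    floor = best_child(root, floor_q)
    room = best_child(floor, room_q)
    obj = best_child(room, object_q)
    return [floor.name, room.name, obj.name]


# Hypothetical two-floor building with made-up feature vectors.
building = Node("building", "root", [], [
    Node("floor_0", "floor", [1.0, 0.0, 0.0], [
        Node("kitchen", "room", [0.0, 1.0, 0.0], [
            Node("sink", "object", [0.0, 0.9, 0.1]),
            Node("chair", "object", [0.1, 0.0, 0.9]),
        ]),
        Node("office", "room", [0.0, 0.0, 1.0], [
            Node("desk", "object", [0.0, 0.2, 0.8]),
        ]),
    ]),
    Node("floor_1", "floor", [0.0, 1.0, 0.0], [
        Node("bedroom", "room", [1.0, 0.0, 0.0], [
            Node("bed", "object", [1.0, 0.0, 0.0]),
        ]),
    ]),
])

path = resolve(building, [0.9, 0.1, 0.0], [0.0, 1.0, 0.1], [0.0, 0.2, 1.0])
print(" -> ".join(path))
```

In this sketch, each level only compares the query against the children of the node selected one level up, so the number of similarity computations depends on the branching of the graph rather than on the number of points in a dense map.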
Taxonomy
Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling
