TL;DR
INHerit-SG is a framework that incrementally builds hierarchical 3D semantic scene graphs with built-in retrieval, so that robots can handle complex embodied queries during navigation.
Contribution
The paper introduces INHerit-SG, an asynchronous dual-stream architecture integrating comprehensive semantic nodes, an event-triggered update scheme, and a retrieval pipeline leveraging large language models.
Findings
Achieves state-of-the-art performance on complex semantic queries.
Effectively handles negations and chained spatial constraints.
Demonstrates robustness in real-world environments.
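To make the second finding concrete, here is a minimal sketch of what a query with a negation and a chained spatial constraint looks like when evaluated against a scene graph. It is not the paper's retrieval pipeline, which relies on large language models. The `Node` class, its fields, and the `find` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical scene-graph node; field names are illustrative only."""
    node_id: str
    label: str
    room: str
    near: set = field(default_factory=set)  # ids of spatially adjacent nodes


def find(nodes, label, room=None, not_room=None, near_any=None):
    """Return ids of nodes with the given label that satisfy the constraints.

    not_room expresses a negation ("not in the kitchen").
    near_any takes the ids returned by an earlier find() call, which is
    how spatial constraints are chained in this sketch.
    """
    hits = []
    for n in nodes:
        if n.label != label:
            continue
        if room is not None and n.room != room:
            continue
        if not_room is not None and n.room == not_room:
            continue
        if near_any is not None and not (n.near & set(near_any)):
            continue
        hits.append(n.node_id)
    return hits


# Toy graph (assumed data, not taken from the paper).
nodes = [
    Node("c1", "chair", "kitchen", {"t1"}),
    Node("c2", "chair", "bedroom", {"w1"}),
    Node("c3", "chair", "bedroom", {"b1"}),
    Node("t1", "table", "kitchen", {"c1"}),
    Node("w1", "window", "bedroom", {"c2"}),
    Node("b1", "bed", "bedroom", {"c3"}),
]

# "The chair next to a bedroom window, but not one in the kitchen."
windows = find(nodes, "window", room="bedroom")                 # first link
result = find(nodes, "chair", not_room="kitchen", near_any=windows)  # second link
```

Each `find` call narrows the candidate set, so a longer chain is just a longer sequence of calls whose outputs feed the next `near_any`.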
Abstract
Driven by recent advancements in foundation models, semantic scene graphs have emerged as a promising paradigm for high-level 3D environmental abstraction in robot navigation. However, existing frameworks struggle to handle complex embodied queries while ensuring continuous semantic graph construction. To address these limitations, we present INHerit-SG, an asynchronous dual-stream architecture that systematically structures the 3D environment into a RAG-ready knowledge base. Specifically, our framework integrates comprehensive node representations, an event-triggered asynchronous update scheme, and a structured retrieval mechanism. Geometric segmentation is decoupled from semantic reasoning to maintain mapping efficiency, and the semantic nodes also store natural language summaries to support text-based retrieval. Furthermore, we propose an interpretable retrieval…
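The decoupling described in the abstract can be illustrated with a small sketch: a fast geometric stream updates node positions on every observation, and a slower semantic stream runs only when an event is posted. This is an assumption-laden illustration, not the authors' implementation. The function names, the `graph` layout, the distance-threshold trigger, and the `summarize` stand-in are all hypothetical; the abstract does not say which events trigger a semantic update.

```python
import queue
import threading

events = queue.Queue()  # geometric stream -> semantic stream
graph = {}              # node_id -> {"position": tuple, "summary": str or None}


def summarize(label, position):
    """Stand-in for the slow semantic step (e.g. a foundation-model call)."""
    return f"{label} at {position}"


def semantic_worker():
    """Semantic stream: wakes up only when an event arrives."""
    while True:
        item = events.get()
        if item is None:  # sentinel value: shut down
            break
        node_id, label, position = item
        graph[node_id]["summary"] = summarize(label, position)


def geometric_update(node_id, label, position, threshold=0.5):
    """Geometric stream: always stores the position, and posts an event
    only for new nodes or moves larger than the (assumed) threshold."""
    old = graph.get(node_id)
    if old is None:
        changed = True
        graph[node_id] = {"position": position, "summary": None}
    else:
        changed = max(abs(a - b) for a, b in zip(old["position"], position)) > threshold
        old["position"] = position
    if changed:
        events.put((node_id, label, position))
    return changed


worker = threading.Thread(target=semantic_worker)
worker.start()
triggered = [
    geometric_update("chair_1", "chair", (1.0, 2.0, 0.0)),  # new node
    geometric_update("chair_1", "chair", (1.1, 2.0, 0.0)),  # small jitter
    geometric_update("chair_1", "chair", (3.0, 2.0, 0.0)),  # real move
]
events.put(None)  # stop the semantic stream once pending events are handled
worker.join()
```

Because the geometric stream never waits on `summarize`, mapping keeps its own pace, and the stored text summary is what a text-based retriever would later search over.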
