SGR3 Model: Scene Graph Retrieval-Reasoning Model in 3D
Zirui Wang, Ruiping Liu, Yufan Chen, Junwei Zheng, Weijia Fan, Kunyu Peng, Di Wen, Jiale Wei, Jiaming Zhang, Rainer Stiefelhagen

TL;DR
The SGR3 Model is a training-free, retrieval-augmented framework for 3D scene graph generation that combines multi-modal large language models with cross-modal retrieval and bypasses explicit 3D reconstruction.
Contribution
It presents a novel, training-free approach combining large language models with retrieval-augmented generation for 3D scene graph reasoning.
Findings
Achieves performance competitive with training-free baselines.
Performs on par with GNN-based models.
Retrieval-based external information is explicitly integrated into reasoning.
Abstract
3D scene graphs provide a structured representation of object entities and their relationships, enabling high-level interpretation and reasoning for robots while remaining intuitively understandable to humans. Existing approaches for 3D scene graph generation typically combine scene reconstruction with graph neural networks (GNNs). However, such pipelines require multi-modal data that may not always be available, and their reliance on heuristic graph construction can constrain the prediction of relationship triplets. In this work, we introduce a Scene Graph Retrieval-Reasoning Model in 3D (SGR3 Model), a training-free framework that leverages multi-modal large language models (MLLMs) with retrieval-augmented generation (RAG) for semantic scene graph generation. SGR3 Model bypasses the need for explicit 3D reconstruction. Instead, it enhances relational reasoning by incorporating…
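The retrieval-augmented idea described in the abstract can be illustrated with a generic sketch. This is not the authors' implementation: the toy two-dimensional embeddings, the `retrieve` and `build_prompt` helpers, the example triplets, and the prompt wording are all hypothetical, chosen only to show how retrieved relationship triplets might be placed into an MLLM prompt as external context.

```python
import math
from typing import List, Sequence, Tuple

# A relationship triplet: (subject, predicate, object). Hypothetical format.
Triplet = Tuple[str, str, str]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: Sequence[float],
             store: List[Tuple[Sequence[float], Triplet]],
             k: int = 2) -> List[Triplet]:
    """Return the k stored triplets whose embeddings are most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query, item[0]), reverse=True)
    return [triplet for _, triplet in ranked[:k]]


def build_prompt(objects: List[str], retrieved: List[Triplet]) -> str:
    """Assemble a text prompt that lists retrieved triplets as external context."""
    examples = "\n".join(f"- {s} {p} {o}" for s, p, o in retrieved)
    return (
        "Objects visible in the scene: " + ", ".join(objects) + "\n"
        "Relationships retrieved from similar scenes:\n" + examples + "\n"
        "List the relationship triplets for this scene."
    )


# Toy store with hand-written embeddings (assumption: a real system would
# obtain embeddings from an image or text encoder).
store = [
    ([0.9, 0.1], ("cup", "standing on", "table")),
    ([0.0, 1.0], ("lamp", "hanging on", "ceiling")),
    ([0.8, 0.3], ("chair", "close by", "table")),
]

top = retrieve([1.0, 0.0], store, k=2)
prompt = build_prompt(["cup", "chair", "table"], top)
```

In this sketch `top` holds the two triplets closest to the query embedding, and `prompt` is the string that would be passed, together with the input images, to a multi-modal model. How SGR3 actually builds its retrieval store and prompt is not specified in the excerpt above.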
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Graph Neural Networks · Graph Theory and Algorithms
