Incremental 3D Semantic Scene Graph Prediction from RGB Sequences
Shun-Cheng Wu, Keisuke Tateno, Nassir Navab, Federico Tombari

TL;DR
This paper introduces a real-time method for incrementally constructing 3D semantic scene graphs from RGB image sequences, enabling scene understanding and reasoning in real-world applications.
Contribution
It presents a novel incremental entity estimation pipeline combined with a scene graph prediction network for real-time 3D scene graph construction from RGB sequences.
Findings
Outperforms state-of-the-art methods on the 3RScan dataset.
Effectively reconstructs sparse point maps and fuses multi-view entity data.
Enables real-time scene understanding with iterative message passing.
Abstract
3D semantic scene graphs are a powerful holistic representation as they describe the individual objects and depict the relation between them. They are compact high-level graphs that enable many tasks requiring scene reasoning. In real-world settings, existing 3D estimation methods produce robust predictions that mostly rely on dense inputs. In this work, we propose a real-time framework that incrementally builds a consistent 3D semantic scene graph of a scene given an RGB image sequence. Our method consists of a novel incremental entity estimation pipeline and a scene graph prediction network. The proposed pipeline simultaneously reconstructs a sparse point map and fuses entity estimation from the input images. The proposed network estimates 3D semantic scene graphs with iterative message passing using multi-view and geometric features extracted from the scene entities. Extensive…
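The iterative message passing mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' network, which uses learned updates over multi-view and geometric features. It only shows the general idea under simple assumptions: each node averages its own feature vector with the mean of its incoming neighbours' features, repeated for a fixed number of rounds. The names `message_passing`, `node_feats`, `edges`, and `num_iters` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def mean(vectors: List[Vector], dim: int) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def message_passing(
    node_feats: Dict[int, Vector],
    edges: List[Tuple[int, int]],
    num_iters: int = 2,
) -> Dict[int, Vector]:
    """Toy, non-learned message passing over a directed graph.

    Assumption: the update rule is a fixed 50/50 blend of a node's own
    feature and the mean of its incoming neighbours' features. A real
    scene graph network would replace this with learned functions.
    """
    dim = len(next(iter(node_feats.values())))
    # Copy so the caller's features are not mutated.
    feats = {n: list(v) for n, v in node_feats.items()}

    for _ in range(num_iters):
        # Gather messages (here: the sender's current feature) per receiver.
        incoming: Dict[int, List[Vector]] = {n: [] for n in feats}
        for src, dst in edges:
            incoming[dst].append(feats[src])

        new_feats: Dict[int, Vector] = {}
        for n, own in feats.items():
            if incoming[n]:
                agg = mean(incoming[n], dim)
                new_feats[n] = [0.5 * a + 0.5 * b for a, b in zip(own, agg)]
            else:
                # Nodes without incoming edges keep their feature unchanged.
                new_feats[n] = list(own)
        feats = new_feats

    return feats


if __name__ == "__main__":
    # Two connected entities and one isolated entity.
    feats = {0: [0.0], 1: [2.0], 2: [5.0]}
    edges = [(0, 1), (1, 0)]
    print(message_passing(feats, edges, num_iters=1))
```

In this toy setup the two connected nodes move toward each other's features after each round, while the isolated node is left untouched. All nodes are updated from the previous round's features, so the result does not depend on node ordering.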
Taxonomy
Topics: 3D Shape Modeling and Analysis · Human Pose and Action Recognition · Graph Theory and Algorithms
