3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera

Iro Armeni; Zhi-Yang He; JunYoung Gwak; Amir R. Zamir; Martin Fischer; Jitendra Malik; Silvio Savarese

arXiv:1910.02527·cs.CV·October 8, 2019

TL;DR

This paper introduces a semi-automatic framework for constructing comprehensive 3D scene graphs that integrate diverse semantic information across objects, rooms, and cameras within a building, leveraging existing detection methods and multi-view consistency.

Contribution

The paper proposes a semi-automatic method for generating 3D scene graphs that enhances existing 2D detection methods with multi-view consistency, reducing the manual effort required.

Findings

01 Successfully constructs 3D scene graphs with diverse semantics.

02 Enhances detection accuracy through multi-view consistency.

03 Reduces manual labor in scene graph creation.
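
The summary above does not say how multi-view consistency is enforced. As a rough illustration of the general idea only, the sketch below uses a plain majority vote over the labels that one 3D object receives from several camera views. The vote is a stand-in technique, not the paper's method, and the function name `fuse_labels` and the sample labels are hypothetical.

```python
from collections import Counter


def fuse_labels(labels_per_view):
    """Return the most frequent label across views and its share of the votes.

    Assumption: each entry is the class label that a 2D detector assigned
    to the same 3D object in one camera view.
    """
    if not labels_per_view:
        raise ValueError("need at least one view")
    counts = Counter(labels_per_view)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels_per_view)


# Hypothetical detections of one object seen from four camera locations.
label, agreement = fuse_labels(["chair", "chair", "sofa", "chair"])
```

In this toy case the single "sofa" detection is outvoted, which is the intuition behind using agreement across views to suppress per-image errors.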

Abstract

A comprehensive semantic understanding of a scene is important for many applications - but in what space should diverse semantic information (e.g., objects, scene categories, material types, texture, etc.) be grounded and what should be its structure? Aspiring to have one unified structure that hosts diverse types of semantics, we follow the Scene Graph paradigm in 3D, generating a 3D Scene Graph. Given a 3D mesh and registered panoramic images, we construct a graph that spans the entire building and includes semantics on objects (e.g., class, material, and other attributes), rooms (e.g., scene category, volume, etc.) and cameras (e.g., location, etc.), as well as the relationships among these entities. However, this process is prohibitively labor heavy if done manually. To alleviate this we devise a semi-automatic framework that employs existing detection methods and enhances them…
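
To make the structure described in the abstract more concrete, here is a minimal sketch of a graph whose nodes are objects, rooms, and cameras with attributes, and whose edges are relationships among them. It is an illustration under stated assumptions, not the format used by the official repository: the class names `Node` and `SceneGraph`, the explicit building node, the relation names (`contains`, `located_in`, `sees`), and all attribute keys and values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    kind: str  # assumed kinds: "building", "room", "object", "camera"
    attributes: dict = field(default_factory=dict)


@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)  # node_id -> Node
    edges: list = field(default_factory=list)  # (source_id, relation, target_id)

    def add_node(self, node_id, kind, **attributes):
        self.nodes[node_id] = Node(node_id, kind, attributes)

    def relate(self, source_id, relation, target_id):
        # Refuse edges that point at nodes which were never added.
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must be added before relating them")
        self.edges.append((source_id, relation, target_id))

    def neighbors(self, source_id, relation):
        # Targets reachable from source_id through the given relation.
        return [t for s, r, t in self.edges if s == source_id and r == relation]


# Hypothetical example values, not taken from the paper's data.
graph = SceneGraph()
graph.add_node("building_0", "building")
graph.add_node("room_0", "room", scene_category="kitchen", volume_m3=32.0)
graph.add_node("object_0", "object", object_class="chair", material="wood")
graph.add_node("camera_0", "camera", location=(1.0, 2.0, 1.5))
graph.relate("building_0", "contains", "room_0")
graph.relate("room_0", "contains", "object_0")
graph.relate("camera_0", "located_in", "room_0")
graph.relate("camera_0", "sees", "object_0")
```

Keeping attributes in a free-form dictionary mirrors the abstract's goal of hosting diverse types of semantics in one structure; a stricter schema per node kind would be an equally reasonable choice.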

Code & Models

Repositories

StanfordVL/3DSceneGraph (Official)
