SceneLinker: Compositional 3D Scene Generation via Semantic Scene Graph from RGB Sequences

Seok-Young Kim; Dooyoung Kim; Woojin Cho; Hail Song; Suji Kang; Woontack Woo

arXiv:2602.02974·cs.CV·February 4, 2026

TL;DR

SceneLinker is a new framework that generates realistic 3D scenes from RGB sequences using semantic scene graphs, improving the alignment of virtual content with real-world object arrangements for mixed reality applications.

Contribution

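It introduces a graph network with cross-check feature attention for scene graph prediction, and a graph-variational autoencoder (graph-VAE) with a joint shape and layout block for 3D scene generation from RGB data. Together these target two weaknesses of prior methods: incomplete modeling of the contextual relationships between objects, and an emphasis on shape diversity over alignment with real object arrangements.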

Findings

1. Outperforms state-of-the-art methods on the 3RScan/3DSSG and SG-FRONT datasets.

2. Effective in complex indoor environments with scene graph constraints.

3. Enables consistent 3D space creation from physical environments for MR.

Abstract

We introduce SceneLinker, a novel framework that generates compositional 3D scenes via semantic scene graph from RGB sequences. To adaptively experience Mixed Reality (MR) content based on each user's space, it is essential to generate a 3D scene that reflects the real-world layout by compactly capturing the semantic cues of the surroundings. Prior works struggled to fully capture the contextual relationship between objects or mainly focused on synthesizing diverse shapes, making it challenging to generate 3D scenes aligned with object arrangements. We address these challenges by designing a graph network with cross-check feature attention for scene graph prediction and constructing a graph-variational autoencoder (graph-VAE), which consists of a joint shape and layout block for 3D scene generation. Experiments on the 3RScan/3DSSG and SG-FRONT datasets demonstrate that our approach…
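To make the central data structure concrete, here is a minimal Python sketch of a semantic scene graph: objects as labeled nodes and pairwise relationships as directed, labeled edges. It is an illustration only, not the authors' implementation. The class names (`SceneObject`, `Relation`, `SceneGraph`), the helper `build_example_graph`, and the example labels and predicates are all assumptions chosen for readability; the abstract does not specify the schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SceneObject:
    """A graph node: one object instance with a semantic label (hypothetical schema)."""
    obj_id: int
    label: str


@dataclass(frozen=True)
class Relation:
    """A directed edge: subject --predicate--> target (hypothetical schema)."""
    subject_id: int
    predicate: str
    target_id: int


@dataclass
class SceneGraph:
    """Container for nodes and edges; not the paper's data format."""
    objects: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_object(self, obj_id, label):
        self.objects[obj_id] = SceneObject(obj_id, label)

    def add_relation(self, subject_id, predicate, target_id):
        # Reject edges that point at objects the graph does not contain.
        if subject_id not in self.objects or target_id not in self.objects:
            raise KeyError("both endpoints must be added before relating them")
        self.relations.append(Relation(subject_id, predicate, target_id))

    def triples(self):
        """Return (subject label, predicate, target label) in insertion order."""
        return [
            (self.objects[r.subject_id].label, r.predicate, self.objects[r.target_id].label)
            for r in self.relations
        ]


def build_example_graph():
    """Build a small indoor scene; labels and predicates are illustrative."""
    graph = SceneGraph()
    graph.add_object(1, "floor")
    graph.add_object(2, "table")
    graph.add_object(3, "chair")
    graph.add_object(4, "lamp")
    graph.add_relation(2, "standing on", 1)
    graph.add_relation(3, "standing on", 1)
    graph.add_relation(3, "close by", 2)
    graph.add_relation(4, "standing on", 2)
    return graph


if __name__ == "__main__":
    for subject, predicate, target in build_example_graph().triples():
        print(f"{subject} --{predicate}--> {target}")
```

In a pipeline like the one the abstract describes, a graph of this kind would be the output of the scene graph prediction stage and the conditioning input to the generation stage. Keeping nodes and edges as plain records makes it easy to convert them later into whatever tensor format a graph network expects.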

Taxonomy

Topics: 3D Shape Modeling and Analysis · Face recognition and analysis · Generative Adversarial Networks and Image Synthesis