Seen2Scene: Completing Realistic 3D Scenes with Visibility-Guided Flow
Quan Meng, Yujin Chen, Lei Li, Matthias Nießner, Angela Dai

TL;DR
Seen2Scene is a novel flow matching-based method that trains on incomplete real-world 3D scans for realistic scene completion and generation, using visibility-guided masking and sparse transformers.
Contribution
It introduces visibility-guided flow matching for training on real incomplete scans, enabling realistic 3D scene completion without synthetic data.
Findings
Outperforms baselines in completion accuracy.
Produces coherent and realistic 3D scenes.
Effectively models complex scene structures.
Abstract
We present Seen2Scene, the first flow matching-based approach that trains directly on incomplete, real-world 3D scans for scene completion and generation. Unlike prior methods that rely on complete and hence synthetic 3D data, our approach introduces visibility-guided flow matching, which explicitly masks out unknown regions in real scans, enabling effective learning from real-world, partial observations. We represent 3D scenes using truncated signed distance field (TSDF) volumes encoded in sparse grids and employ a sparse transformer to efficiently model complex scene structures while masking unknown regions. We employ 3D layout boxes as an input conditioning signal, and our approach is flexibly adapted to various other inputs such as text or partial scans. By learning directly from real-world, incomplete 3D scans, Seen2Scene enables realistic 3D scene completion for complex, cluttered…
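The idea of masking unknown regions during flow matching training can be illustrated with a small sketch. This is not the authors' implementation: it assumes a linear (rectified-flow-style) interpolation path between noise and data, and the names `masked_flow_matching_loss`, `predict_velocity`, and `visible` are hypothetical. It only shows how a visibility mask could restrict the loss to observed voxels, so that unobserved parts of a scan contribute no training signal. The paper's sparse transformer and its own masking scheme are not reproduced here.

```python
import numpy as np

def masked_flow_matching_loss(predict_velocity, x1, visible, rng):
    """Flow matching loss restricted to observed voxels (illustrative sketch).

    predict_velocity: callable (xt, t) -> array shaped like xt (hypothetical model)
    x1:      (N, C) TSDF values or features for N sparse voxels
    visible: (N,) boolean mask, True where the scan actually observed the voxel
    rng:     numpy random Generator
    """
    x0 = rng.standard_normal(x1.shape)   # noise sample
    t = rng.uniform()                    # time in [0, 1)
    xt = (1.0 - t) * x0 + t * x1         # assumed linear path from noise to data
    target = x1 - x0                     # velocity of that linear path
    pred = predict_velocity(xt, t)

    per_voxel = ((pred - target) ** 2).mean(axis=1)
    weights = visible.astype(per_voxel.dtype)
    # Average over visible voxels only; unknown voxels get zero weight.
    return float((per_voxel * weights).sum() / max(weights.sum(), 1.0))

# Usage with a placeholder model that always predicts zero velocity.
rng = np.random.default_rng(0)
x1 = rng.standard_normal((6, 1))
visible = np.array([True, True, True, False, False, False])
loss = masked_flow_matching_loss(lambda xt, t: np.zeros_like(xt), x1, visible, rng)
```

Because the weights are zero outside the visible set, whatever values are stored in unobserved voxels never enter the loss, which is the property that makes training on partial scans possible in this sketch.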
