Towards Geometric and Textural Consistency 3D Scene Generation via Single Image-guided Model Generation and Layout Optimization
Xiang Tang, Ruotong Li, Xiaopeng Fan

TL;DR
This paper introduces a three-stage framework for generating coherent and detailed 3D scenes from a single image, combining object recovery, spatial geometry inference, and layout optimization.
Contribution
It presents a novel method that explicitly models geometry and texture, improving scene coherence and object detail in single-image 3D scene generation.
Findings
Outperforms state-of-the-art in geometric accuracy
Achieves higher texture fidelity in generated models
Enhances scene layout synthesis quality
Abstract
In recent years, 3D generation has made great strides in both academia and industry. However, generating 3D scenes from a single RGB image remains a significant challenge, as current approaches often struggle to ensure both object generation quality and scene coherence in multi-object scenarios. To overcome these limitations, we propose a novel three-stage framework for 3D scene generation with explicit geometric representations and high-quality textural details via single image-guided model generation and spatial layout optimization. Our method begins with an image instance segmentation and inpainting phase, which recovers missing details of occluded objects in the input images, thereby achieving complete generation of foreground 3D assets. Subsequently, our approach captures the spatial geometry of reference image by constructing pseudo-stereo viewpoint for camera parameter estimation…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
Topics3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging
