Towards Geometric and Textural Consistency 3D Scene Generation via Single Image-guided Model Generation and Layout Optimization

Xiang Tang; Ruotong Li; Xiaopeng Fan

arXiv:2507.14841 · cs.GR · February 18, 2026

Open Access

TL;DR

This paper introduces a three-stage framework for generating coherent and detailed 3D scenes from a single image, combining object recovery, spatial geometry inference, and layout optimization.

Contribution

The paper presents a method that builds explicit geometric representations with high-quality textures, aiming to improve both scene coherence and per-object detail in single-image 3D scene generation.

Findings

01. Outperforms state-of-the-art in geometric accuracy

02. Achieves higher texture fidelity in generated models

03. Enhances scene layout synthesis quality

Abstract

In recent years, 3D generation has made great strides in both academia and industry. However, generating 3D scenes from a single RGB image remains a significant challenge, as current approaches often struggle to ensure both object generation quality and scene coherence in multi-object scenarios. To overcome these limitations, we propose a novel three-stage framework for 3D scene generation with explicit geometric representations and high-quality textural details via single image-guided model generation and spatial layout optimization. Our method begins with an image instance segmentation and inpainting phase, which recovers missing details of occluded objects in the input image, thereby enabling complete generation of foreground 3D assets. Subsequently, our approach captures the spatial geometry of the reference image by constructing a pseudo-stereo viewpoint for camera parameter estimation…

Taxonomy

Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging