Self-Evolving 3D Scene Generation from a Single Image
Kaizhi Zheng, Yue Fan, Jing Gu, Zishuo Xu, Xuehai He, Xin Eric Wang

TL;DR
EvoScene is a self-evolving, training-free framework that progressively reconstructs detailed 3D scenes from a single image. It alternates between the 2D and 3D domains, combining geometric reasoning from 3D generation models with visual knowledge from video generation models.
Contribution
The paper introduces a self-evolving, training-free method that iteratively improves 3D scene reconstruction from a single image by integrating existing 3D generation and video generation models in a three-stage process.
Findings
Achieves superior geometric stability and view-consistent textures.
Effectively completes unseen regions in 3D scenes.
Produces ready-to-use 3D meshes for practical applications.
Abstract
Generating high-quality, textured 3D scenes from a single image remains a fundamental challenge in vision and graphics. Recent image-to-3D generators recover reasonable geometry from single views, but their object-centric training limits generalization to complex, large-scale scenes with faithful structure and texture. We present EvoScene, a self-evolving, training-free framework that progressively reconstructs complete 3D scenes from single images. The key idea is combining the complementary strengths of existing models: geometric reasoning from 3D generation models and visual knowledge from video generation models. Through three iterative stages--Spatial Prior Initialization, Visual-guided 3D Scene Mesh Generation, and Spatial-guided Novel View Generation--EvoScene alternates between 2D and 3D domains, gradually improving both structure and appearance. Experiments on diverse scenes…
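The alternation between 2D and 3D domains described in the abstract can be pictured as a simple control loop. The sketch below is only an illustration of that loop, not the authors' implementation: the `SceneState` container, the `evolve_scene` function, the three stage callables, the `num_iterations` parameter, and the choice to rerun all three stages every round are all assumptions made for this example. Strings stand in for images, spatial priors, and meshes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SceneState:
    """Hypothetical container; the paper's real data structures are not given here."""
    views: List[str]                     # 2D domain: input image plus generated views
    spatial_prior: Optional[str] = None  # coarse geometry estimate
    mesh: Optional[str] = None           # 3D domain: current textured scene mesh


def evolve_scene(
    image: str,
    init_prior: Callable[[List[str], Optional[str]], str],
    generate_mesh: Callable[[str, List[str]], str],
    generate_views: Callable[[str, List[str]], List[str]],
    num_iterations: int = 3,
) -> SceneState:
    """Alternate between 2D views and a 3D mesh for a fixed number of rounds."""
    state = SceneState(views=[image])
    for _ in range(num_iterations):
        # Stage 1 (Spatial Prior Initialization): estimate coarse geometry
        # from the views gathered so far and the previous mesh, if any.
        state.spatial_prior = init_prior(state.views, state.mesh)
        # Stage 2 (Visual-guided 3D Scene Mesh Generation): build a mesh
        # from the spatial prior, guided by the current 2D views.
        state.mesh = generate_mesh(state.spatial_prior, state.views)
        # Stage 3 (Spatial-guided Novel View Generation): render or generate
        # new views from the mesh and add them to the 2D evidence.
        state.views = state.views + generate_views(state.mesh, state.views)
    return state


def demo() -> SceneState:
    """Run the loop with toy stand-ins for the three stage models."""
    init_prior = lambda views, mesh: f"prior({len(views)} views)"
    generate_mesh = lambda prior, views: f"mesh<{prior}>"
    generate_views = lambda mesh, views: [f"view{len(views)}"]
    return evolve_scene("input.png", init_prior, generate_mesh, generate_views,
                        num_iterations=2)
```

Passing the stage models in as callables keeps the loop independent of any particular 3D or video generator, which matches the training-free framing: existing models would be plugged in rather than retrained.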
Taxonomy
Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging
