Extend3D: Town-Scale 3D Generation

Seungwoo Yoon; Jinmo Kim; Jaesik Park

arXiv:2603.29387 · cs.CV · April 1, 2026

TL;DR

Extend3D introduces a novel, training-free pipeline for generating large-scale 3D scenes from a single image by extending and dividing the latent space, refining patches, and optimizing for better structure and texture fidelity.

Contribution

The paper presents a new method that extends object-centric 3D generative models to scene-scale generation without training, using patch-wise generation and 3D-aware optimization.

Findings

1. Outperforms prior methods in human preference tests.
2. Achieves higher quantitative scores in 3D scene generation.
3. Completes 3D structures effectively via an under-noising technique.

Abstract

In this paper, we propose Extend3D, a training-free pipeline for 3D scene generation from a single image, built upon an object-centric 3D generative model. To overcome the limitations of fixed-size latent spaces in object-centric models for representing wide scenes, we extend the latent space in the $x$ and $y$ directions. Then, by dividing the extended latent space into overlapping patches, we apply the object-centric 3D generative model to each patch and couple them at each time step. Since patch-wise 3D generation with image conditioning requires strict spatial alignment between image and latent patches, we initialize the scene using a point cloud prior from a monocular depth estimator and iteratively refine occluded regions through SDEdit. We discovered that treating the incompleteness of 3D structure as noise during 3D refinement enables 3D completion via a concept, which we term…
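The patch-wise step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract only says that patches are coupled at each time step, and averaging the overlapping regions is an assumption made here for illustration. The names `patch_origins`, `coupled_step`, and `step_fn` are hypothetical, and the latent is reduced to a 2D grid of scalars over $x$ and $y$, whereas a real latent would also carry a $z$ axis and feature channels.

```python
from typing import Callable, List

Grid = List[List[float]]


def patch_origins(extent: int, patch: int, stride: int) -> List[int]:
    """Start indices of overlapping windows that together cover [0, extent)."""
    if patch > extent or stride < 1:
        raise ValueError("need patch <= extent and stride >= 1")
    origins = list(range(0, extent - patch + 1, stride))
    if origins[-1] != extent - patch:
        # Add one last window flush with the border so no cell is left out.
        origins.append(extent - patch)
    return origins


def coupled_step(
    latent: Grid,
    patch: int,
    stride: int,
    step_fn: Callable[[Grid], Grid],
) -> Grid:
    """Apply step_fn to every overlapping patch, then merge the results.

    step_fn stands in for one denoising step of a fixed-size,
    object-centric model. Merging by averaging is an assumption.
    """
    height, width = len(latent), len(latent[0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]

    for y0 in patch_origins(height, patch, stride):
        for x0 in patch_origins(width, patch, stride):
            window = [row[x0:x0 + patch] for row in latent[y0:y0 + patch]]
            updated = step_fn(window)
            for dy in range(patch):
                for dx in range(patch):
                    total[y0 + dy][x0 + dx] += updated[dy][dx]
                    count[y0 + dy][x0 + dx] += 1

    # Every cell is covered at least once, so count is never zero.
    return [
        [total[y][x] / count[y][x] for x in range(width)]
        for y in range(height)
    ]
```

Calling `coupled_step` once per time step keeps neighbouring patches consistent, because each patch starts the next step from the merged latent rather than from its own isolated output. A stride smaller than the patch size is what creates the overlap.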

Code & Models

Repositories

snu-vgilab/Extend3D (GitHub)
