EarthCrafter: Scalable 3D Earth Generation via Dual-Sparse Latent Diffusion
Shang Liu, Chenjie Cao, Chaohui Yu, Wen Qian, Jing Wang, Fan Wang

TL;DR
EarthCrafter is a framework for large-scale 3D Earth surface generation. It pairs a new aerial dataset, Aerial-Earth3D, with dual-sparse latent diffusion models to generate geographically plausible terrain efficiently.
Contribution
The paper combines Aerial-Earth3D, a new dataset of 50k curated aerial scenes, with a dual-sparse latent diffusion architecture that models structure and texture separately, targeting efficient, high-resolution generation at geographic scale.
Findings
Superior performance in large-scale terrain generation
Effective separation of structure and texture modeling
Versatile applications including urban layout and terrain synthesis
Abstract
Despite the remarkable developments achieved by recent 3D generation works, scaling these methods to geographic extents, such as modeling thousands of square kilometers of Earth's surface, remains an open challenge. We address this through a dual innovation in data infrastructure and model architecture. First, we introduce Aerial-Earth3D, the largest 3D aerial dataset to date, consisting of 50k curated scenes (each measuring 600 m × 600 m) captured across the U.S. mainland, comprising 45M multi-view Google Earth frames. Each scene provides pose-annotated multi-view images, depth maps, normals, semantic segmentation, and camera poses, with explicit quality control to ensure terrain diversity. Building on this foundation, we propose EarthCrafter, a tailored framework for large-scale 3D Earth generation via sparse-decoupled latent diffusion. Our architecture separates structural and textural…
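To put the dataset figures quoted in the abstract in perspective, the following back-of-envelope sketch derives the ground area per scene, the combined area, and the average number of frames per scene. It assumes that "50k" and "45M" are round figures and that scenes do not overlap; neither assumption is stated in the abstract. The function name `dataset_scale` is hypothetical and is not taken from the paper or its code.

```python
# Back-of-envelope arithmetic on the figures quoted in the abstract.
# Assumptions (not confirmed by the source): "50k" and "45M" are taken as
# exact round numbers, and scenes are treated as non-overlapping tiles.

NUM_SCENES = 50_000        # "50k curated scenes"
SCENE_SIDE_M = 600         # each scene measures 600 m on a side
NUM_FRAMES = 45_000_000    # "45M multi-view Google Earth frames"


def dataset_scale(num_scenes: int, side_m: int, num_frames: int):
    """Return (area per scene in km^2, total area in km^2, mean frames per scene)."""
    scene_area_km2 = side_m ** 2 / 1_000_000          # m^2 -> km^2
    total_area_km2 = num_scenes * side_m ** 2 / 1_000_000
    frames_per_scene = num_frames / num_scenes
    return scene_area_km2, total_area_km2, frames_per_scene


scene_area, total_area, frames = dataset_scale(NUM_SCENES, SCENE_SIDE_M, NUM_FRAMES)
print(f"area per scene:   {scene_area:.2f} km^2")
print(f"total area:       {total_area:,.0f} km^2")
print(f"frames per scene: {frames:,.0f}")
```

Taken at face value, these figures give 0.36 km² per scene, about 18,000 km² in total, and an average of about 900 frames per scene, which is consistent with the abstract's framing of "thousands of square kilometers".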
Taxonomy
Topics: Robotics and Sensor-Based Localization · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis
