Envisioning global urban development with satellite imagery and generative AI
Kailai Sun, Yuebing Liang, Mingyi He, Yunhan Zheng, Alok Prakash, Shenhao Wang, Jinhua Zhao, Alex "Sandy" Pentland

TL;DR
This paper introduces a multimodal generative AI framework that creates realistic satellite images of urban development worldwide, enabling scenario visualization, urban planning, and style transfer across cities.
Contribution
It presents a novel AI framework integrating prompts and geospatial controls to generate diverse urban imagery and transfer urban styles globally.
Findings
Generated urban images are comparable to real images according to human experts.
The framework successfully transfers urban styles across different cities.
The framework's latent representations improve performance on downstream tasks such as carbon emission prediction.
Abstract
Urban development has been a defining force in human history, shaping cities for centuries. However, past studies mostly analyze such development as predictive tasks, failing to reflect its generative nature. Therefore, this study designs a multimodal generative AI framework to envision sustainable urban development at a global scale. By integrating prompts and geospatial controls, our framework can generate high-fidelity, diverse, and realistic urban satellite imagery across the 500 largest metropolitan areas worldwide. It enables users to specify urban development goals, creating new images that align with them while offering diverse scenarios whose appearance can be controlled with text prompts and geospatial constraints. It also facilitates urban redevelopment practices by learning from the surrounding environment. Beyond visual synthesis, we find that it encodes and interprets…
