Envisioning global urban development with satellite imagery and generative AI

Kailai Sun; Yuebing Liang; Mingyi He; Yunhan Zheng; Alok Prakash; Shenhao Wang; Jinhua Zhao; Alex "Sandy" Pentland

arXiv:2603.26831 · cs.CV · March 31, 2026

TL;DR

This paper introduces a multimodal generative AI framework that creates realistic satellite images of urban development worldwide, enabling scenario visualization, urban planning, and style transfer across cities.

Contribution

The paper presents a multimodal generative AI framework that integrates text prompts and geospatial controls to generate diverse urban satellite imagery and to transfer urban styles between cities worldwide.

Findings

01

Generated urban images are comparable to real images according to human experts.

02

The framework successfully transfers urban styles across different cities.

03

Latent representations improve downstream tasks like carbon emission prediction.

Abstract

Urban development has been a defining force in human history, shaping cities for centuries. However, past studies mostly treat such development as a predictive task, failing to reflect its generative nature. Therefore, this study designs a multimodal generative AI framework to envision sustainable urban development at a global scale. By integrating prompts and geospatial controls, our framework can generate high-fidelity, diverse, and realistic urban satellite imagery across the 500 largest metropolitan areas worldwide. It enables users to specify urban development goals, creating new images that align with them while offering diverse scenarios whose appearance can be controlled with text prompts and geospatial constraints. It also facilitates urban redevelopment practices by learning from the surrounding environment. Beyond visual synthesis, we find that it encodes and interprets…
