GPS as a Control Signal for Image Generation

Chao Feng; Ziyang Chen; Aleksander Holynski; Alexei A. Efros; Andrew Owens

arXiv:2501.12390 · cs.CV · January 23, 2025

Open Access

TL;DR

This paper shows that the GPS tags in photo metadata can serve as a control signal for image generation: GPS-conditioned models produce images that vary with location within a city, and GPS conditioning improves the 3D structure estimated from those models.

Contribution

It introduces GPS-to-image models, including a diffusion model conditioned on both GPS and text, and extracts 3D models from them through score distillation sampling, using GPS conditioning to constrain the appearance of the reconstruction from each viewpoint.

Findings

01. GPS conditioning captures neighborhood-specific appearances

02. Improves accuracy of 3D structure estimation

03. Enables fine-grained cityscape image generation

Abstract

We show that the GPS tags contained in photo metadata provide a useful control signal for image generation. We train GPS-to-image models and use them for tasks that require a fine-grained understanding of how images vary within a city. In particular, we train a diffusion model to generate images conditioned on both GPS and text. The learned model generates images that capture the distinctive appearance of different neighborhoods, parks, and landmarks. We also extract 3D models from 2D GPS-to-image models through score distillation sampling, using GPS conditioning to constrain the appearance of the reconstruction from each viewpoint. Our evaluations suggest that our GPS-conditioned models successfully learn to generate images that vary based on location, and that GPS conditioning improves estimated 3D structure.
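To make the idea of GPS as a conditioning input concrete, here is a minimal sketch of one common way to turn a latitude/longitude pair into a fixed-length vector that a generative model could consume. It is an illustration only and is not taken from the paper: the bounding box `CITY_BBOX`, the helper names `normalize_gps` and `fourier_features`, and the choice of sinusoidal features are all assumptions.

```python
import math

# Hypothetical bounding box (lat_min, lat_max, lon_min, lon_max) for one city.
# These values are illustrative assumptions, not from the paper.
CITY_BBOX = (40.70, 40.80, -74.02, -73.93)


def normalize_gps(lat, lon, bbox):
    """Map (lat, lon) into [-1, 1] x [-1, 1] relative to a bounding box."""
    lat_min, lat_max, lon_min, lon_max = bbox
    u = 2.0 * (lat - lat_min) / (lat_max - lat_min) - 1.0
    v = 2.0 * (lon - lon_min) / (lon_max - lon_min) - 1.0
    return u, v


def fourier_features(u, v, num_bands=4):
    """Encode normalized coordinates with sin/cos at doubling frequencies.

    Higher bands let a model tell apart locations that are close together,
    which matters when appearance changes block by block.
    """
    feats = []
    for k in range(num_bands):
        freq = (2.0 ** k) * math.pi
        for x in (u, v):
            feats.append(math.sin(freq * x))
            feats.append(math.cos(freq * x))
    return feats


# Encode one GPS tag near the center of the assumed bounding box.
u, v = normalize_gps(40.75, -73.975, CITY_BBOX)
embedding = fourier_features(u, v)
print(len(embedding))  # 2 coordinates * 2 functions * 4 bands
```

In a setup like the one the abstract describes, a vector of this kind would be supplied to the diffusion model alongside the text embedding; how the authors actually encode and inject GPS is not stated on this page.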

Taxonomy

Topics: Inertial Sensor and Navigation

Methods: Diffusion · Score Distillation Sampling