OpenStreetView-5M: The Many Roads to Global Visual Geolocation
Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis, Constantin Aronssohn, Nacim Bouia, Stephanie Fu, Romain Loiseau, Van Nguyen Nguyen, Charles Raude, Elliot Vincent, Lintao XU, Hongyu Zhou, Loic Landrieu

TL;DR
OpenStreetView-5M is a large-scale, open-access dataset of over 5.1 million geo-referenced street view images from around the world, enabling more robust evaluation of visual geolocation algorithms.
Contribution
The paper introduces OpenStreetView-5M, a comprehensive dataset with strict train/test separation to improve evaluation of geolocation models.
Findings
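The paper benchmarks a range of state-of-the-art image encoders, spatial representations, and training strategies.
The dataset covers 225 countries and territories.
The strict train/test separation supports evaluating learned geographical features beyond mere memorization.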
Abstract
Determining the location of an image anywhere on Earth is a complex visual task, which makes it particularly relevant for evaluating computer vision algorithms. Yet, the absence of standard, large-scale, open-access datasets with reliably localizable images has limited its potential. To address this issue, we introduce OpenStreetView-5M, a large-scale, open-access dataset comprising over 5.1 million geo-referenced street view images, covering 225 countries and territories. In contrast to existing benchmarks, we enforce a strict train/test separation, allowing us to evaluate the relevance of learned geographical features beyond mere memorization. To demonstrate the utility of our dataset, we conduct an extensive benchmark of various state-of-the-art image encoders, spatial representations, and training strategies. All associated codes and models can be found at…
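To make the idea of a strict train/test separation concrete, here is a minimal sketch of one way such a separation could be checked: drop any training image that lies too close to a test image. This is an illustration under stated assumptions, not the authors' procedure. The function names `haversine_km` and `filter_train`, the `min_km` threshold, and the plain `(latitude, longitude)` tuple format are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def filter_train(train, test, min_km=1.0):
    """Keep only training points at least `min_km` away from every test point.

    `min_km` is an assumed, illustrative threshold. This brute-force loop is
    O(len(train) * len(test)); a spatial index would be needed at scale.
    """
    return [
        p for p in train
        if all(haversine_km(p[0], p[1], q[0], q[1]) >= min_km for q in test)
    ]


if __name__ == "__main__":
    train_points = [(48.8566, 2.3522), (40.7128, -74.0060)]  # Paris, New York
    test_points = [(48.8567, 2.3523)]  # a few metres from the Paris point
    print(filter_train(train_points, test_points))
```

With a filter of this kind, a model cannot score well on the test set simply by recalling near-duplicate training views of the same street, which is the memorization effect the abstract refers to.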
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · Multimodal Machine Learning Applications
