From Pixels to Places: A Systematic Benchmark for Evaluating Image Geolocalization Ability in Large Language Models

Lingyao Li; Runlong Yu; Qikai Hu; Bowei Li; Min Deng; Yang Zhou; Xiaowei Jia

arXiv:2508.01608 · cs.CV · May 19, 2026

TL;DR

This paper introduces IMAGEO-Bench, a comprehensive benchmark for evaluating the image geolocalization abilities of large language models across diverse datasets and geographic regions.

Contribution

The paper systematically assesses LLMs' geolocalization performance, revealing geospatial biases and reasoning capabilities, and provides a new benchmark for future research.

Findings

01 Closed-source models generally show stronger reasoning than open-source ones.

02 LLMs perform better in high-resource regions such as North America and Europe.

03 Successful geolocalization relies on recognizing landmarks and urban environments.

Abstract

Image geolocalization, the task of identifying the geographic location depicted in an image, is important for applications in crisis response, digital forensics, and location-based intelligence. While recent advances in large language models (LLMs) offer new opportunities for visual reasoning, their ability to perform image geolocalization remains underexplored. In this study, we introduce a benchmark called IMAGEO-Bench that systematically evaluates accuracy, distance error, geospatial bias, and reasoning process. Our benchmark includes three diverse datasets covering global street scenes, points of interest (POIs) in the United States, and a private collection of unseen images. Through experiments on 10 state-of-the-art LLMs, including both open- and closed-source models, we reveal clear performance disparities, with closed-source models generally showing stronger reasoning.…
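
The abstract names accuracy and distance error as evaluation axes but this page does not give their exact definitions. As a rough illustration only, the sketch below assumes distance error is the great-circle (haversine) distance between predicted and true coordinates, and that accuracy is the share of predictions falling within a distance threshold. The function names, the Earth-radius constant, and the threshold values are hypothetical choices for this example, not taken from the paper.

```python
import math
import statistics

# Assumption: mean Earth radius in km; the paper's constant (if any) is not stated here.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before sqrt/asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def summarize_errors(predictions, truths, thresholds_km=(1, 25, 200, 750, 2500)):
    """Return the median distance error and the fraction of predictions within each threshold.

    predictions, truths: equal-length sequences of (lat, lon) tuples in degrees.
    thresholds_km: hypothetical cut-offs chosen for illustration.
    """
    if len(predictions) != len(truths) or not predictions:
        raise ValueError("predictions and truths must be non-empty and equal in length")
    errors = [haversine_km(p[0], p[1], t[0], t[1])
              for p, t in zip(predictions, truths)]
    within = {th: sum(e <= th for e in errors) / len(errors) for th in thresholds_km}
    return {"median_km": statistics.median(errors), "within_km": within}


# Usage: two toy predictions, one exact and one off by one degree of longitude at the equator.
result = summarize_errors([(0.0, 0.0), (0.0, 1.0)], [(0.0, 0.0), (0.0, 0.0)])
```

Reporting both a median error and threshold-based accuracy is a common design choice because a single mean can be dominated by a few predictions that land on the wrong continent.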

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning