GeoRC: A Benchmark for Geolocation Reasoning Chains
Mohit Talreja, Joshua Diao, Jim Thannikary James, Radu Casapu, Tejas Santanam, Ethan Mendes, Alan Ritter, Wei Xu, James Hays

TL;DR
GeoRC is a new benchmark sourced from GeoGuessr experts that evaluates the ability of vision language models to produce detailed, auditable geolocation reasoning chains, revealing current limitations of VLMs in fine-grained visual attribute extraction.
Contribution
This paper introduces GeoRC, the first benchmark for geolocation reasoning chains from expert sources, and evaluates VLMs' reasoning capabilities against human experts.
Findings
Large closed-source VLMs like Gemini and GPT-5 rival humans in location prediction.
Open-weight VLMs like Llama and Qwen perform poorly, only slightly better than random guessing.
VLMs struggle with extracting fine-grained visual attributes from high-resolution images.
Abstract
Vision Language Models (VLMs) are good at recognizing the global location of a photograph; their geolocation prediction accuracy rivals the best human experts. But many VLMs are startlingly bad at explaining which image evidence led to their prediction, even when their location prediction is correct. In this paper, we introduce GeoRC, the first benchmark for geolocation reasoning chains sourced directly from Champion-tier GeoGuessr experts, including the reigning world champion. This benchmark consists of 800 "ground truth" reasoning chains across 500 query scenes from GeoGuessr maps, with expert chains addressing hundreds of different discriminative attributes, such as soil properties, architecture, and license plate shapes. We evaluate LLM-as-a-judge and VLM-as-a-judge strategies for scoring VLM-generated reasoning chains against our expert reasoning chains and find that…
