Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework

Zirui Song; Jingpu Yang; Yuan Huang; Jonathan Tonglet; Zeyu Zhang; Tao Cheng; Meng Fang; Iryna Gurevych; Xiuying Chen

arXiv:2502.13759 · cs.CV · January 7, 2026

TL;DR

This paper introduces GeoComp, a large-scale dataset of human gameplay for geolocation; GeoCoT, a reasoning framework; and GeoEval, an evaluation metric. Together they aim to improve the accuracy and interpretability of geolocation models.

Contribution

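The paper presents a geolocation framework with three components: a new large-scale dataset (GeoComp), a multi-step reasoning method (GeoCoT), and an evaluation metric (GeoEval). These target the limitations the abstract attributes to existing approaches: small, noisy, automatically constructed datasets and coarse, non-interpretable localization.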

Findings

01

The GeoComp dataset contains 25 million entries and 3 million geo-tagged locations.

02

GeoCoT improves geolocation accuracy by up to 25%.

03

The framework enhances interpretability of geolocation reasoning.
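
Accuracy figures for geolocation are usually derived from the distance between predicted and true coordinates. The following is a minimal sketch of that general idea, assuming predictions and ground truth are given as (latitude, longitude) pairs in degrees. It is not the paper's GeoEval metric, which this page does not describe; the function names `haversine_km` and `accuracy_within` and the 25 km threshold are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def accuracy_within(preds, truths, threshold_km):
    """Fraction of predictions within threshold_km of the true location.

    preds and truths are equal-length lists of (lat, lon) tuples.
    Hypothetical helper for illustration, not an API from the paper.
    """
    if len(preds) != len(truths) or not truths:
        raise ValueError("preds and truths must be non-empty and equal length")
    hits = sum(
        haversine_km(p[0], p[1], t[0], t[1]) <= threshold_km
        for p, t in zip(preds, truths)
    )
    return hits / len(truths)


# Illustrative coordinates: one exact hit, one guess a quarter-globe away.
preds = [(48.8566, 2.3522), (0.0, 0.0)]
truths = [(48.8566, 2.3522), (0.0, 90.0)]
city_level_accuracy = accuracy_within(preds, truths, threshold_km=25.0)
```

Reporting the same quantity at several thresholds gives a coarse-to-fine picture of where a model's guesses land.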

Abstract

Geolocation, the task of identifying an image's location, requires complex reasoning and is crucial for navigation, monitoring, and cultural preservation. However, current methods often produce coarse, imprecise, and non-interpretable localization. A major challenge lies in the quality and scale of existing geolocation datasets. These datasets are typically small-scale and automatically constructed, leading to noisy data and inconsistent task difficulty, with images that either reveal answers too easily or lack sufficient clues for reliable inference. To address these challenges, we introduce a comprehensive geolocation framework with three key components: GeoComp, a large-scale dataset; GeoCoT, a novel reasoning method; and GeoEval, an evaluation metric, collectively designed to address critical challenges and drive advancements in geolocation research. At the core of this framework is…

Code & Models

Repositories

theeighthday/seekworld
pytorch

Models

TheEighthDay/SeekWorld_RL_PLUS
Hugging Face model · 3 downloads · 2 likes

Taxonomy

Topics: Human Motion and Animation · Multimodal Machine Learning Applications · Human Pose and Action Recognition