Locatability-Guided Adaptive Reasoning for Image Geo-Localization with Vision-Language Models
Bo Yu, Fengze Yang, Yiming Liu, Chao Wang, Xuewen Luo, Taozhe Li, Ruimin Ke, Xiaofan Zhou, Chenxi Liu

TL;DR
This paper introduces a new adaptive reasoning framework for image geo-localization using vision-language models, which improves accuracy and reduces hallucinations by dynamically adjusting reasoning depth based on image locatability.
Contribution
It proposes an Optimized Locatability Score, the locatability-stratified reasoning dataset Geo-ADAPT-51K, and a two-stage Group Relative Policy Optimization (GRPO) curriculum that adapts reasoning depth for geo-localization.
Findings
Achieves state-of-the-art results on multiple benchmarks.
Reduces hallucinations in reasoning processes.
Demonstrates effective adaptive reasoning depth control.
Abstract
The emergence of Vision-Language Models (VLMs) has introduced new paradigms for global image geo-localization through retrieval-augmented generation (RAG) and reasoning-driven inference. However, RAG methods are constrained by retrieval database quality, while reasoning-driven approaches fail to internalize image locatability, relying on inefficient, fixed-depth reasoning paths that increase hallucinations and degrade accuracy. To overcome these limitations, we introduce an Optimized Locatability Score that quantifies an image's suitability for deep reasoning in geo-localization. Using this metric, we curate Geo-ADAPT-51K, a locatability-stratified reasoning dataset enriched with augmented reasoning trajectories for complex visual scenes. Building on this foundation, we propose a two-stage Group Relative Policy Optimization (GRPO) curriculum with customized reward functions that…
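As a rough illustration of the GRPO step mentioned in the abstract, the sketch below shows the group-relative advantage computation commonly associated with GRPO: several responses are sampled for the same image, each receives a scalar reward, and each reward is normalized against the mean and standard deviation of its own group. The function name, the `eps` constant, and the example reward values are assumptions for illustration only; the paper's customized reward functions are not described in this excerpt and are not reproduced here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampling group (GRPO-style sketch).

    `rewards` holds one scalar reward per sampled response to the same
    prompt. The name and the `eps` default are hypothetical choices.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    # eps keeps the division defined when every reward in the group is equal
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one image
example_rewards = [1.0, 0.0, 0.5, 0.5]
example_advantages = group_relative_advantages(example_rewards)
```

Responses that score above their group's mean get a positive advantage and those below get a negative one, so no separate value network is needed to provide a baseline.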
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications
