GaGA: Towards Interactive Global Geolocation Assistant

Zhiyang Dou; Zipeng Wang; Xumeng Han; Guorong Li; Zhipei Huang; Zhenjun Han

arXiv:2412.08907 · cs.CV · April 21, 2025

TL;DR

GaGA introduces an interactive, large vision-language model-based system for global image geolocation, leveraging user interaction and a new dataset to achieve state-of-the-art accuracy and explainability.

Contribution

The paper presents GaGA, a novel interactive geolocation assistant utilizing LVLMs and a new dataset, surpassing traditional methods with improved accuracy and user interaction capabilities.

Findings

01. Achieves 4.57% higher accuracy at the country level

02. Improves accuracy by 2.92% at the city level

03. Sets a new benchmark on the GWS15k dataset
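
Country-level and city-level accuracy in image geolocation are usually computed as the fraction of predictions whose great-circle distance to the true location falls within a fixed radius. The sketch below illustrates that metric under stated assumptions: the 25 km (city) and 750 km (country) radii are the conventional values from the geolocation literature and are not given on this page, and the function names and sample coordinates are hypothetical. It is not the authors' evaluation code.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

# Assumed thresholds: conventional radii in the geolocation literature,
# not values stated on this page.
THRESHOLDS_KM = {"city": 25.0, "country": 750.0}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def accuracy_at_thresholds(predictions, ground_truth, thresholds=THRESHOLDS_KM):
    """Fraction of predictions within each distance threshold of the true location."""
    if not predictions or len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must be non-empty and equal in length")
    errors = [haversine_km(*p, *g) for p, g in zip(predictions, ground_truth)]
    return {
        name: sum(err <= limit for err in errors) / len(errors)
        for name, limit in thresholds.items()
    }


# Hypothetical sample: (latitude, longitude) pairs in degrees.
truth = [(48.8566, 2.3522), (35.6762, 139.6503), (-33.8688, 151.2093)]
preds = [(48.8049, 2.1204), (34.6937, 135.5023), (51.5074, -0.1278)]
scores = accuracy_at_thresholds(preds, truth)
print(scores)
```

A reported gain such as the 4.57% at the country level would then correspond to a difference between two such fractions, one per method, evaluated on the same image set.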

Abstract

Global geolocation, which seeks to predict the geographical location of images captured anywhere in the world, is one of the most challenging tasks in the field of computer vision. In this paper, we introduce an innovative interactive global geolocation assistant named GaGA, built upon the flourishing large vision-language models (LVLMs). GaGA uncovers geographical clues within images and combines them with the extensive world knowledge embedded in LVLMs to determine the geolocations while also providing justifications and explanations for the prediction results. We further designed a novel interactive geolocation method that surpasses traditional static inference approaches. It allows users to intervene, correct, or provide clues for the predictions, making the model more flexible and practical. The development of GaGA relies on the newly proposed Multi-modal Global Geolocation…

Taxonomy

Topics: Mobile Agent-Based Network Management · Mobile and Web Applications · Robotics and Automated Systems