GTPred: Benchmarking MLLMs for Interpretable Geo-localization and Time-of-capture Prediction
Jinnao Li, Zijian Chen, Tingzhu Chen, Changbo Wang

TL;DR
GTPred introduces a comprehensive benchmark for evaluating multi-modal large language models on geo-temporal prediction tasks, highlighting the importance of temporal data for improved geographic inference.
Contribution
The paper presents GTPred, a new benchmark dataset and evaluation framework that incorporates temporal information into geo-localization tasks for MLLMs.
Findings
Temporal information improves geo-localization accuracy.
Current MLLMs lack sufficient world knowledge and reasoning capabilities.
Benchmarking on GTPred exposes these limitations and points to areas for model improvement.
Abstract
Geo-localization aims to infer the geographic location where an image was captured using observable visual evidence. Traditional methods achieve impressive results through large-scale training on massive image corpora. With the emergence of multi-modal large language models (MLLMs), recent studies have explored their applications in geo-localization, benefiting from improved accuracy and interpretability. However, existing benchmarks largely ignore the temporal information inherent in images, which can further constrain the location. To bridge this gap, we introduce GTPred, a novel benchmark for geo-temporal prediction. GTPred comprises 370 globally distributed images spanning over 120 years. We evaluate MLLM predictions by jointly considering year and hierarchical location sequence matching, and further assess intermediate reasoning chains using meticulously annotated ground-truth…
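The abstract says predictions are scored by jointly considering the year and a hierarchical location sequence, but the exact metric is not given in this summary. The following is only a minimal illustrative sketch of how such a joint score could be computed. The function names, the prefix-matching rule, the 10-year tolerance, and the equal weighting are all assumptions made for illustration, not the authors' definitions.

```python
from typing import Sequence


def location_prefix_score(pred: Sequence[str], truth: Sequence[str]) -> float:
    """Fraction of ground-truth levels matched from coarse to fine.

    Hypothetical rule: levels are compared in order (e.g. country, region,
    city) and matching stops at the first disagreement.
    """
    if not truth:
        return 0.0
    matched = 0
    for p, t in zip(pred, truth):
        if p.strip().lower() != t.strip().lower():
            break
        matched += 1
    return matched / len(truth)


def year_score(pred_year: int, true_year: int, tolerance: int = 10) -> float:
    """Linear decay from 1.0 (exact year) to 0.0 at `tolerance` years off.

    The tolerance value is an assumption, not taken from the paper.
    """
    return max(0.0, 1.0 - abs(pred_year - true_year) / tolerance)


def joint_score(
    pred_loc: Sequence[str],
    true_loc: Sequence[str],
    pred_year: int,
    true_year: int,
    weight: float = 0.5,
) -> float:
    """Weighted combination of location and year scores (assumed weighting)."""
    loc = location_prefix_score(pred_loc, true_loc)
    yr = year_score(pred_year, true_year)
    return weight * loc + (1.0 - weight) * yr


if __name__ == "__main__":
    truth = ["France", "Ile-de-France", "Versailles"]
    pred = ["France", "Ile-de-France", "Paris"]
    print(joint_score(pred, truth, pred_year=1995, true_year=2000))
```

In this sketch, a prediction that gets the country and region right but the city wrong, and misses the year by five, receives partial credit on both axes rather than zero.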
Taxonomy
Topics: Multimodal Machine Learning Applications · Indoor and Outdoor Localization Technologies · Robotics and Sensor-Based Localization
