TIGeR: A Unified Framework for Time, Images and Geo-location Retrieval
David G. Shatwell, Sirnam Swetha, Mubarak Shah

TL;DR
TIGeR is a unified framework that enables joint reasoning about visual appearance, location, and time for image retrieval, supporting complex queries involving spatial and temporal constraints.
Contribution
The paper introduces TIGeR, a unified model for geo-temporal image retrieval, along with a dataset of 4.5 million image-location-time triplets for training and evaluation.
Findings
TIGeR outperforms baselines by up to 16% in time-of-year prediction.
TIGeR improves time-of-day prediction accuracy by 8% over baselines.
TIGeR achieves 14% higher recall than baselines in geo-time-aware retrieval.
Abstract
Many real-world applications in digital forensics, urban monitoring, and environmental analysis require jointly reasoning about visual appearance, location, and time. Beyond standard geo-localization and time-of-capture prediction, these applications increasingly demand more complex capabilities, such as retrieving an image captured at the same location as a query image but at a specified target time. We formalize this problem as Geo-Time Aware Image Retrieval and propose TIGeR, a unified framework for Time, Images and Geo-location Retrieval. TIGeR supports flexible input configurations (single-modality and multi-modality queries) and uses the same representation to perform (i) geo-localization, (ii) time-of-capture prediction, and (iii) geo-time-aware retrieval. By preserving the underlying location identity despite large appearance changes, TIGeR enables retrieval based on where and…
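
To make the retrieval task described in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's model. It assumes a hypothetical joint embedding built by concatenating a hand-picked location vector (standing in for the output of an image encoder) with a cyclic encoding of the capture month, and it ranks gallery items by cosine similarity. All names (`embed`, `retrieve`, `LOCATIONS`, the gallery entries) are made up for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def time_features(month):
    """Cyclic encoding of a month (1-12) so that December and January are close."""
    angle = 2 * math.pi * (month - 1) / 12
    return [math.cos(angle), math.sin(angle)]

def embed(location_vec, month):
    """Hypothetical joint embedding (an assumption, not TIGeR's architecture):
    location vector concatenated with cyclic time features, then normalized."""
    return normalize(list(location_vec) + time_features(month))

def retrieve(query_location_vec, target_month, gallery, k=1):
    """Return the ids of the k gallery items most similar to the composed
    (query location, target time) embedding under cosine similarity."""
    q = embed(query_location_vec, target_month)
    ranked = sorted(
        gallery,
        key=lambda item: sum(a * b for a, b in zip(q, item["emb"])),
        reverse=True,
    )
    return [item["id"] for item in ranked[:k]]

# Toy stand-ins for location identity; a real system would derive these from images.
LOCATIONS = {"bridge": [1.0, 0.0], "plaza": [0.0, 1.0]}

# Toy gallery: each location photographed in four different months.
gallery = [
    {"id": f"{name}_m{month}", "emb": embed(vec, month)}
    for name, vec in LOCATIONS.items()
    for month in (1, 4, 7, 10)
]

# Query: "same place as the bridge image, but in July".
top = retrieve(LOCATIONS["bridge"], 7, gallery, k=1)
```

In this toy setup the top result is the gallery item that matches both the query's location and the target month. The point is only to show the shape of the query: the location comes from the query image, the time comes from the user, and both are scored against one shared representation.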
