Viewpoint Invariant Dense Matching for Visual Geolocalization
Gabriele Berton, Carlo Masone, Valerio Paolicelli, Barbara Caputo

TL;DR
This paper introduces GeoWarp, a trainable method that embeds viewpoint invariance into dense feature extraction, improving visual geolocalization accuracy while training on only unlabeled data and weak labels.
Contribution
GeoWarp is the first approach to learn viewpoint invariance directly during dense feature extraction for geolocalization, using self-supervised and weakly supervised training.
Findings
Boosts accuracy of state-of-the-art retrieval methods
Effective with unlabeled and weakly labeled data
Easily integrated into existing pipelines
Abstract
In this paper, we propose a novel method for image matching based on dense local features and tailored for visual geolocalization. Dense local feature matching is robust against changes in illumination and occlusions, but not against viewpoint shifts, which are a fundamental aspect of geolocalization. Our method, called GeoWarp, directly embeds invariance to viewpoint shifts in the process of extracting dense features. This is achieved via a trainable module that learns from the data an invariance that is meaningful for the task of recognizing places. We also devise a new self-supervised loss and two new weakly supervised losses to train this module using only unlabeled data and weak labels. GeoWarp is implemented efficiently as a re-ranking method that can be easily embedded into pre-existing visual geolocalization pipelines. Experimental validation on standard geolocalization…
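As a rough illustration of what "re-ranking" means in a retrieval pipeline, the sketch below shortlists database images by global-descriptor distance and then reorders that shortlist by a dense-feature distance. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `rerank`, its parameters, and the mean-squared-difference score are hypothetical stand-ins, and the learned warping step that GeoWarp applies before comparing dense features is not modeled here.

```python
import numpy as np


def rerank(query_global, db_global, query_dense, db_dense, k=5):
    """Shortlist k database images by global descriptor, then reorder by dense distance.

    All names and the scoring function are illustrative assumptions.
    """
    # Stage 1: retrieval. Euclidean distance between global descriptors.
    global_dists = np.linalg.norm(db_global - query_global, axis=1)
    shortlist = np.argsort(global_dists)[:k]

    # Stage 2: re-ranking. Placeholder dense score: mean squared difference
    # between dense feature maps. A viewpoint-invariant method would first
    # align (warp) the image pair; that step is omitted in this sketch.
    dense_dists = np.array(
        [np.mean((db_dense[i] - query_dense) ** 2) for i in shortlist]
    )

    # Return database indices of the shortlist, best dense match first.
    return shortlist[np.argsort(dense_dists)]
```

Because only the k shortlisted candidates are compared densely, the second stage adds a cost that depends on k rather than on the size of the database, which is why a re-ranking step can be attached to an existing retrieval pipeline.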
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization
