TL;DR
This paper combines an attention mechanism with few-shot unsupervised domain adaptation to build a domain-robust deep network for cross-domain visual place recognition. It outperforms the prior state of the art while using two orders of magnitude fewer target-domain images.
Contribution
It presents a novel hybrid method that enhances domain robustness in visual geolocalization using limited unlabeled target domain images, and introduces the SVOX dataset.
Findings
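Outperforms the current state of the art while using two orders of magnitude fewer target-domain images
Combines an attention mechanism with few-shot unsupervised domain adaptation, learning the target distribution from a small number of unlabeled target-domain images
Introduces SVOX, a new large-scale dataset for cross-domain visual place recognition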
Abstract
We address the task of cross-domain visual place recognition, where the goal is to geolocalize a given query image against a labeled gallery when the query and the gallery belong to different visual domains. To achieve this, we focus on building a domain-robust deep network by leveraging an attention mechanism combined with few-shot unsupervised domain adaptation techniques, where we use a small number of unlabeled target-domain images to learn about the target distribution. With our method, we are able to outperform the current state of the art while using two orders of magnitude fewer target-domain images. Finally, we propose a new large-scale dataset for cross-domain visual place recognition, called SVOX. The PyTorch code is available at https://github.com/valeriopaolicelli/AdAGeo.
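To make the task setup concrete, the retrieval step that the abstract describes (matching a query image against a labeled gallery) can be sketched as a nearest-neighbor search over image descriptors. This is only an illustrative sketch, not the authors' implementation: the function names, the toy 3-dimensional descriptors, and the coordinates are all hypothetical, and in practice the descriptors would come from a trained network such as the one in the linked repository.

```python
import math

def l2_normalize(vec):
    # Scale a descriptor to unit length; leave an all-zero vector unchanged.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def cosine_similarity(a, b):
    # Cosine similarity between two descriptors of equal length.
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def localize(query_descriptor, gallery, top_k=1):
    # gallery: list of (descriptor, (lat, lon)) pairs, i.e. the labeled gallery.
    # Returns the coordinates of the top_k gallery entries most similar to the query.
    ranked = sorted(
        gallery,
        key=lambda item: cosine_similarity(query_descriptor, item[0]),
        reverse=True,
    )
    return [coords for _, coords in ranked[:top_k]]

# Hypothetical toy data: descriptors and coordinates are made up for illustration.
gallery = [
    ([1.0, 0.0, 0.0], (45.07, 7.69)),
    ([0.0, 1.0, 0.0], (45.08, 7.70)),
    ([0.0, 0.0, 1.0], (45.09, 7.71)),
]
query = [0.1, 0.9, 0.2]  # stands in for a descriptor of a target-domain query image
prediction = localize(query, gallery)
```

In the cross-domain setting, the difficulty is that query descriptors from the target domain drift away from gallery descriptors of the same place, which is the gap the domain adaptation described in the abstract is meant to reduce.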
