Learning Semantics for Visual Place Recognition through Multi-Scale Attention
Valerio Paolicelli, Antonio Tavera, Carlo Masone, Gabriele Berton, Barbara Caputo

TL;DR
This paper introduces a novel visual place recognition method that learns robust global embeddings by integrating semantic and appearance information through a multi-scale attention mechanism, improving accuracy in large-scale geotagged datasets.
Contribution
It presents the first VPR algorithm that dynamically guides segmentation with multi-scale attention, combining semantic and visual features for enhanced place recognition.
Findings
Outperforms state-of-the-art methods on various scenarios
Demonstrates robustness in large-scale geotagged datasets
Introduces a synthetic dataset for place recognition and segmentation
Abstract
In this paper, we address the task of visual place recognition (VPR), where the goal is to retrieve the correct GPS coordinates of a given query image against a huge geotagged gallery. While recent works have shown that building descriptors incorporating semantic and appearance information is beneficial, current state-of-the-art methods opt for a top-down definition of the significant semantic content. Here we present the first VPR algorithm that learns robust global embeddings from both visual appearance and semantic content of the data, with the segmentation process being dynamically guided by the recognition of places through a multi-scale attention module. Experiments on various scenarios validate this new approach and demonstrate its performance against state-of-the-art methods. Finally, we propose the first synthetic-world dataset suited for both place recognition and segmentation…
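The retrieval setting described in the abstract can be illustrated with a minimal sketch: given a global embedding for the query image, find the most similar embedding in a geotagged gallery and return its GPS coordinates. This is not the authors' implementation. The function names, the gallery layout (a list of embedding and coordinate pairs), the use of cosine similarity, and the toy 3-dimensional vectors are all assumptions made for illustration; in practice the embeddings would be produced by a trained network.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def retrieve_location(query_embedding, gallery, top_k=1):
    """Return GPS coordinates of the top_k gallery entries closest to the query.

    `gallery` is a list of (embedding, (lat, lon)) pairs. This layout is a
    hypothetical choice for the sketch, not something the paper specifies.
    """
    q = l2_normalize(query_embedding)
    scored = []
    for embedding, gps in gallery:
        g = l2_normalize(embedding)
        similarity = sum(a * b for a, b in zip(q, g))  # cosine similarity
        scored.append((similarity, gps))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [gps for _, gps in scored[:top_k]]


# Toy embeddings standing in for learned global descriptors (assumed values).
gallery = [
    ([1.0, 0.0, 0.0], (45.0703, 7.6869)),
    ([0.0, 1.0, 0.0], (41.9028, 12.4964)),
    ([0.0, 0.0, 1.0], (45.4642, 9.1900)),
]

best = retrieve_location([0.1, 0.9, 0.2], gallery)
print(best)
```

With a gallery of realistic size, the linear scan would normally be replaced by an approximate nearest-neighbor index; the brute-force loop is kept here only for clarity.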
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Indoor and Outdoor Localization Technologies
Methods: Greedy Policy Search
