Spatio-Semantic ConvNet-Based Visual Place Recognition
Luis G. Camara, Libor Přeučil

TL;DR
This paper introduces a two-stage visual place recognition system built on CNN features from a pre-trained VGG16. It outperforms existing methods on five benchmark datasets, with the largest gains on the most challenging ones.
Contribution
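The paper proposes a two-stage approach: top candidate images are first retrieved from a database of places, and the query is then compared exhaustively against those candidates using CNN features that encode semantic and spatial information.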
Findings
Outperforms state-of-the-art methods by a large margin on five commonly used benchmark datasets.
Achieves a more than twofold recognition improvement on the most challenging datasets.
Shows that activations from a pre-trained, off-the-shelf VGG16 are effective features for place recognition.
Abstract
We present a Visual Place Recognition system that follows the two-stage format common to image retrieval pipelines. The system encodes images of places by employing the activations of different layers of a pre-trained, off-the-shelf, VGG16 Convolutional Neural Network (CNN) architecture. In the first stage of our method and given a query image of a place, a number of top candidate images is retrieved from a previously stored database of places. In the second stage, we propose an exhaustive comparison of the query image against these candidates by encoding semantic and spatial information in the form of CNN features. Results from our approach outperform by a large margin state-of-the-art visual place recognition methods on five of the most commonly used benchmark datasets. The performance gain is especially remarkable on the most challenging datasets, with more than a twofold recognition…
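The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the cosine-similarity ranking, the toy descriptors, and the second-stage scoring rule are all assumptions made for illustration. In particular, the placeholder second-stage score ignores spatial layout, which the paper's method does encode. In a real system the descriptors would come from VGG16 layer activations rather than hand-written lists.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_candidates(query_global, db_globals, top_n):
    """Stage 1 (assumed form): rank database images by global-descriptor
    similarity and keep the indices of the top_n best matches."""
    scores = [(cosine(query_global, g), i) for i, g in enumerate(db_globals)]
    scores.sort(key=lambda s: (-s[0], s[1]))  # best first, ties by index
    return [i for _, i in scores[:top_n]]

def rerank_score(query_local, cand_local):
    """Stage 2 (hypothetical placeholder): for every local feature of the
    query, take its best match in the candidate, then average.
    Spatial information is NOT modeled here."""
    if not query_local or not cand_local:
        return 0.0
    best = [max(cosine(q, c) for c in cand_local) for q in query_local]
    return sum(best) / len(best)

def recognize_place(query_global, query_local, db_globals, db_locals, top_n=2):
    """Run stage 1, then compare the query exhaustively against each candidate."""
    candidates = retrieve_candidates(query_global, db_globals, top_n)
    return max(candidates, key=lambda i: rerank_score(query_local, db_locals[i]))

# Toy database of three places (made-up descriptors, for illustration only).
db_globals = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
db_locals = [
    [[0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0]],
]
query_global = [1.0, 0.0]
query_local = [[1.0, 0.0], [0.0, 1.0]]

candidates = retrieve_candidates(query_global, db_globals, top_n=2)
best_match = recognize_place(query_global, query_local, db_globals, db_locals)
```

In this toy example the first stage ranks image 0 ahead of image 1, but the more detailed second-stage comparison prefers image 1. That is the motivation for re-ranking a short candidate list instead of trusting the initial retrieval order.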
