A Faster, Lighter and Stronger Deep Learning-Based Approach for Place Recognition

Rui Huang; Ze Huang; Songzhi Su

arXiv:2211.14864 · cs.CV · November 29, 2022

TL;DR

This paper introduces a novel deep learning approach for visual place recognition that is faster, lighter, and more accurate, utilizing a new backbone network and a trainable feature matcher to outperform existing methods.

Contribution

The authors propose RepVGG-lite as a new backbone network and a trainable attention-based feature matcher, significantly reducing model size and inference time while improving accuracy in place recognition.
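The details of RepVGG-lite are not given on this page. As background only, the sketch below illustrates the general RepVGG-style idea of structural re-parameterization: a training-time block with a 3x3 branch, a 1x1 branch and an identity branch can be collapsed into a single 3x3 convolution for inference. It assumes a single channel, zero padding, and no batch normalization; the function names and the toy values are hypothetical and are not taken from the paper.

```python
def conv3x3(image, kernel):
    """Single-channel 3x3 cross-correlation with zero padding (same output size)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di + 1][dj + 1]
            out[i][j] = acc
    return out


def multi_branch(image, k3, k1):
    """Training-time form: 3x3 branch + 1x1 branch + identity branch."""
    y3 = conv3x3(image, k3)
    return [
        [y3[i][j] + k1 * image[i][j] + image[i][j] for j in range(len(image[0]))]
        for i in range(len(image))
    ]


def fuse_branches(k3, k1):
    """Fold the 1x1 weight and the identity into the centre of the 3x3 kernel."""
    fused = [row[:] for row in k3]  # copy so the original kernel is untouched
    fused[1][1] += k1 + 1.0
    return fused


# Hypothetical toy input and weights, chosen only for illustration.
image = [[1.0, 2.0, 0.0], [0.5, -1.0, 3.0], [2.0, 0.0, 1.5]]
k3 = [[0.1, -0.2, 0.0], [0.3, 0.5, -0.1], [0.0, 0.2, 0.4]]
k1 = 0.7

train_out = multi_branch(image, k3, k1)
infer_out = conv3x3(image, fuse_branches(k3, k1))
max_diff = max(
    abs(a - b) for row_a, row_b in zip(train_out, infer_out) for a, b in zip(row_a, row_b)
)
```

Both forms produce the same output up to floating-point rounding, so the fused form needs only one convolution per block at inference time. A full implementation would also fold batch-normalization statistics into the kernel and bias, which this sketch leaves out.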

Findings

01. 14x fewer parameters than Patch-NetVLAD

02. 6.8x lower FLOPs than Patch-NetVLAD

03. 0.5% higher Recall@1 than Patch-NetVLAD
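For readers unfamiliar with the Recall@1 metric, the following is a minimal sketch of how Recall@N is commonly computed in place recognition: a query counts as correct if at least one of its top N retrieved database images is a true match. The function name, the dictionary layout and the toy IDs are assumptions made for illustration; they do not reproduce the paper's evaluation code or data.

```python
def recall_at_n(ranked_ids, ground_truth, n=1):
    """Fraction of queries whose top-n retrieved IDs contain a true match.

    ranked_ids:   dict mapping query ID -> list of database IDs, best first.
    ground_truth: dict mapping query ID -> set of correct database IDs.
    """
    if not ranked_ids:
        raise ValueError("no queries given")
    hits = 0
    for query_id, ranking in ranked_ids.items():
        positives = ground_truth.get(query_id, set())
        if any(db_id in positives for db_id in ranking[:n]):
            hits += 1
    return hits / len(ranked_ids)


# Hypothetical toy retrieval results for four queries.
ranked = {
    "q1": ["db3", "db7", "db1"],
    "q2": ["db2", "db9", "db4"],
    "q3": ["db5", "db6", "db8"],
    "q4": ["db0", "db1", "db2"],
}
truth = {"q1": {"db3"}, "q2": {"db9"}, "q3": {"db8"}, "q4": {"db7"}}

r1 = recall_at_n(ranked, truth, n=1)  # only q1 is correct at rank 1
r3 = recall_at_n(ranked, truth, n=3)  # q1, q2 and q3 are correct within the top 3
```

In this toy example Recall@1 is 0.25 and Recall@3 is 0.75.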

Abstract

Visual Place Recognition is an essential component of systems for camera localization and loop closure detection, and it has attracted widespread interest in domains such as computer vision, robotics and AR/VR. In this work, we propose a faster, lighter and stronger approach that produces models with fewer parameters and spends less time in the inference stage. We design RepVGG-lite as the backbone network of our architecture; it is more discriminative than other general-purpose networks on the Place Recognition task and offers a speed advantage while achieving higher performance. In the feature extraction stage, we extract patch-level descriptors at only one scale from global descriptors. We then design a trainable feature matcher, based on the attention mechanism, that exploits both the spatial relationships of the features and their visual appearance. Comprehensive…
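The abstract does not describe the matcher's architecture. To give a rough sense of what attention over patch descriptors looks like, the sketch below scores an image pair with plain scaled dot-product cross-attention. It is an untrained simplification under stated assumptions: there are no learned projections, the spatial (positional) term is omitted, and the function names and toy descriptors are hypothetical rather than the authors' design.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0  # avoid division by zero
    return [x / norm for x in v]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_score(query_patches, cand_patches):
    """Mean similarity between each query patch and its attended candidate patch."""
    q = [l2_normalize(p) for p in query_patches]
    c = [l2_normalize(p) for p in cand_patches]
    scale = math.sqrt(len(q[0]))
    total = 0.0
    for qp in q:
        # Attention weights of this query patch over all candidate patches.
        weights = softmax([dot(qp, cp) / scale for cp in c])
        # Weighted sum of candidate descriptors.
        attended = [sum(w * cp[k] for w, cp in zip(weights, c)) for k in range(len(qp))]
        total += dot(qp, attended)
    return total / len(q)


# Hypothetical 2-D patch descriptors, for illustration only.
query = [[1.0, 0.0], [0.0, 1.0]]
same_place = [[1.0, 0.0], [0.0, 1.0]]
other_place = [[-1.0, 0.0], [0.0, -1.0]]

same_score = cross_attention_score(query, same_place)
other_score = cross_attention_score(query, other_place)
```

With these toy descriptors the matching pair receives the higher score. A trainable matcher would add learned query, key and value projections and some encoding of patch positions, then optimize them on labeled image pairs.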

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · Automated Road and Building Extraction

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings