Learning Sequence Descriptor based on Spatio-Temporal Attention for Visual Place Recognition

Junqiao Zhao; Fenglin Zhang; Yingfeng Cai; Gengxuan Tian; Wenjie Mu; Chen Ye; Tiantian Feng

arXiv:2305.11467 · cs.CV · January 30, 2024 · 2 cites

TL;DR

This paper introduces a novel spatio-temporal attention-based sequence descriptor for visual place recognition, improving robustness by capturing intrinsic dynamics in frame sequences and outperforming existing methods on benchmark datasets.

Contribution

It proposes a new sequence descriptor that integrates spatial and temporal attention with relative positional encoding for enhanced VPR performance.

Findings

01. Outperforms recent state-of-the-art methods on benchmark datasets
02. Effectively captures spatio-temporal dynamics in frame sequences
03. Utilizes a sliding window and relative positional encoding for sequence modeling
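To make the third finding concrete, here is a minimal sketch of temporal self-attention restricted to a sliding window, with an additive bias that depends only on the relative frame offset. It is a generic illustration, not the authors' architecture: the function name `windowed_temporal_attention`, the scalar per-offset bias table `rel_bias`, and the use of raw frame features as queries, keys, and values (no learned projections) are all assumptions made for brevity.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def windowed_temporal_attention(frames, window, rel_bias):
    """Attend each frame to its neighbours within +/- `window` frames.

    frames:   list of T feature vectors (lists of floats), one per frame.
    window:   half-width of the sliding window (hypothetical parameter).
    rel_bias: dict mapping a relative offset (j - i) to a scalar bias.
              In a trained model this table would be learned; here it is
              supplied by the caller (assumption).
    """
    T, d = len(frames), len(frames[0])
    out = []
    for i in range(T):
        lo, hi = max(0, i - window), min(T, i + window + 1)
        scores = []
        for j in range(lo, hi):
            dot = sum(a * b for a, b in zip(frames[i], frames[j]))
            # Scaled dot-product score plus a bias that depends only on j - i.
            scores.append(dot / math.sqrt(d) + rel_bias[j - i])
        weights = softmax(scores)
        # Weighted sum of the neighbouring frame features.
        out.append([
            sum(w * frames[j][k] for w, j in zip(weights, range(lo, hi)))
            for k in range(d)
        ])
    return out
```

Because the bias is indexed by the offset `j - i` rather than by absolute position, the same table applies wherever the window sits in the sequence, which is what makes relative encoding a natural fit for a sliding window.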

Abstract

Visual Place Recognition (VPR) aims to retrieve frames from a geotagged database that are located at the same place as the query frame. To improve the robustness of VPR in perceptually aliased scenarios, sequence-based VPR methods have been proposed. These methods either match frame sequences against each other or extract sequence descriptors for direct retrieval. However, the former usually assumes constant velocity, which rarely holds in practice, and is computationally expensive and sensitive to sequence length. Although the latter overcomes these problems, existing sequence descriptors are constructed by aggregating features of multiple frames only, without interaction on temporal information, and thus cannot obtain descriptors with spatio-temporal discrimination. In this paper, we propose a sequence descriptor that effectively incorporates…
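The limitation the abstract attributes to aggregation-only descriptors can be shown with a small sketch. It assumes mean pooling as the aggregation step and cosine similarity for retrieval; neither choice is taken from the paper, and the helper names (`mean_pool_descriptor`, `retrieve`) are hypothetical. The point is only that a pooled descriptor ignores frame order, so a sequence and its reversal map to the same descriptor.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def mean_pool_descriptor(frames):
    """Aggregate per-frame features into one sequence descriptor by averaging."""
    d = len(frames[0])
    mean = [sum(f[k] for f in frames) / len(frames) for k in range(d)]
    return l2_normalize(mean)

def retrieve(query_desc, database):
    """Return the index of the database descriptor most similar to the query.

    Descriptors are assumed to be unit length, so the dot product is the
    cosine similarity.
    """
    sims = [sum(a * b for a, b in zip(query_desc, d)) for d in database]
    return max(range(len(sims)), key=sims.__getitem__)

# Toy per-frame features (made up for illustration).
seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
forward = mean_pool_descriptor(seq)
backward = mean_pool_descriptor(seq[::-1])
# `forward` and `backward` agree up to floating-point rounding:
# averaging discards the order in which the frames were observed.
```

A descriptor that lets frames interact over time, for example through attention with positional information, is one way to break this order invariance.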

Code & Models

Repositories

tiev-tongji/spatio-temporal-seqvpr
PyTorch · Official

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications