Spatio-Temporal Matching for Siamese Visual Tracking

Jinpu Zhang; Yuehuan Wang

arXiv:2105.02408 · cs.CV · May 7, 2021 · 1 citation

Open Access

TL;DR

This paper introduces a spatio-temporal matching approach for Siamese visual tracking. It exploits 4-D information (height, width, channel and time), which plain cross correlation neglects, to improve robustness and accuracy.

Contribution

It proposes a space-variant channel-guided correlation (SVC-Corr) for spatial matching and an aberrance repressed module that suppresses aberrances in interframe responses, along with a new anchor-free tracking framework.

Findings

01. Achieves state-of-the-art results on multiple benchmarks.

02. Effectively suppresses aberrances in interframe responses.

03. Enhances robustness and accuracy in challenging scenarios.

Abstract

Similarity matching is a core operation in Siamese trackers. Most Siamese trackers carry out similarity learning via cross correlation that originates from the image matching field. However, unlike 2-D image matching, the matching network in object tracking requires 4-D information (height, width, channel and time). Cross correlation neglects the information from channel and time dimensions, and thus produces ambiguous matching. This paper proposes a spatio-temporal matching process to thoroughly explore the capability of 4-D matching in space (height, width and channel) and time. In spatial matching, we introduce a space-variant channel-guided correlation (SVC-Corr) to recalibrate channel-wise feature responses for each spatial location, which can guide the generation of the target-aware matching features. In temporal matching, we investigate the time-domain context relations of the…
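To make the contrast in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. `xcorr` is plain cross correlation, which sums all channels with equal weight at every location. `channel_weighted_xcorr` is a hypothetical illustration of the general idea behind recalibrating channel responses per spatial location. Both function names and the `weights` argument are assumptions made for this example; in the paper the recalibration is learned by SVC-Corr, whereas here the weights are simply supplied by the caller.

```python
import numpy as np


def xcorr(template, search):
    """Plain cross correlation of a (C, h, w) template over a (C, H, W) search map.

    Every channel contributes equally at every location.
    Returns a response map of shape (H - h + 1, W - w + 1).
    """
    _, h, w = template.shape
    _, H, W = search.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = search[:, i:i + h, j:j + w]
            out[i, j] = np.sum(template * window)
    return out


def channel_weighted_xcorr(template, search, weights):
    """Hypothetical variant: channel responses are rescaled per location.

    `weights` has shape (C, H - h + 1, W - w + 1) and is an assumption of
    this sketch; it stands in for a learned, location-dependent
    channel recalibration.
    """
    _, h, w = template.shape
    _, H, W = search.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = search[:, i:i + h, j:j + w]
            # Per-channel similarity, shape (C,), before any channel mixing.
            per_channel = np.sum(template * window, axis=(1, 2))
            out[i, j] = np.dot(weights[:, i, j], per_channel)
    return out
```

With all weights set to one, the second function reduces to the first, which shows that plain cross correlation is the special case where channel information is ignored.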

Taxonomy

Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Face recognition and analysis