Transformer Tracking

Xin Chen; Bin Yan; Jiawen Zhu; Dong Wang; Xiaoyun Yang; Huchuan Lu

arXiv:2103.15436·cs.CV·March 30, 2021

TL;DR

This paper introduces TransT, a novel Transformer-based tracking method that uses attention mechanisms for feature fusion, outperforming correlation-based methods and achieving high accuracy and speed on multiple benchmarks.

Contribution

The work proposes an attention-based feature fusion network for tracking, replacing correlation, and demonstrates its effectiveness with state-of-the-art results.

Findings

01. Achieves promising results on six challenging datasets.
02. Runs at approximately 50 fps on GPU.
03. Outperforms correlation-based trackers on large-scale benchmarks.

Abstract

Correlation plays a critical role in the tracking field, especially in recent popular Siamese-based trackers. The correlation operation is a simple way to fuse features by measuring the similarity between the template and the search region. However, the correlation operation itself is a local linear matching process, which tends to lose semantic information and easily falls into local optima, and this may be the bottleneck in designing high-accuracy tracking algorithms. Is there a better feature fusion method than correlation? To address this question, and inspired by the Transformer, this work presents a novel attention-based feature fusion network, which effectively combines the template and search region features using attention alone. Specifically, the proposed method includes an ego-context augment module based on self-attention and a cross-feature augment module based on cross-attention. Finally, we…
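
The contrast the abstract draws between correlation and attention-based fusion can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes single-head attention with no learned projections, no positional encodings, and no feed-forward or normalization layers, and the function names (`ego_context_augment`, `cross_feature_augment`, `correlation`) and feature sizes are hypothetical, chosen only for illustration. The `correlation` function is a simplified pixel-wise dot-product stand-in for the correlation used in Siamese trackers.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(query, key, value):
    """Scaled dot-product attention (single head, no learned projections)."""
    d = query.shape[-1]
    weights = softmax(query @ key.T / np.sqrt(d), axis=-1)
    return weights @ value, weights

def ego_context_augment(x):
    """Self-attention with a residual connection: x attends to itself."""
    out, _ = attention(x, x, x)
    return x + out

def cross_feature_augment(x_q, x_kv):
    """Cross-attention with a residual connection: x_q attends to x_kv."""
    out, weights = attention(x_q, x_kv, x_kv)
    return x_q + out, weights

def correlation(template, search):
    """Baseline: a linear similarity score for every (search, template) pair."""
    return search @ template.T

# Hypothetical sizes: flattened feature maps with 32 channels.
rng = np.random.default_rng(0)
template = rng.standard_normal((16, 32))  # e.g. a 4x4 template feature map
search = rng.standard_normal((64, 32))    # e.g. an 8x8 search-region feature map

# Each branch first enriches its own features, then the search features
# gather information from the template through cross-attention.
template = ego_context_augment(template)
search = ego_context_augment(search)
fused, weights = cross_feature_augment(search, template)

print(fused.shape)    # (64, 32)
print(weights.shape)  # (64, 16)
```

Note the difference in output: `correlation` returns only a similarity map, while the attention path uses a softmax-normalized similarity map to aggregate template features into each search location, so the fused result keeps the feature dimension.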

Code & Models

Repositories

chenxin-dlut/TransT
PyTorch · Official

Taxonomy

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam · Softmax · Dense Connections · Attention Is All You Need · Dropout · Layer Normalization · Residual Connection