AiATrack: Attention in Attention for Transformer Visual Tracking

Shenyuan Gao; Chunluan Zhou; Chao Ma; Xinggang Wang; Junsong Yuan

arXiv:2207.09603 · cs.CV · July 25, 2022 · 1 cites

TL;DR

AiATrack introduces an attention-in-attention module that improves correlation accuracy in Transformer visual tracking, achieving state-of-the-art results with real-time performance.

Contribution

The paper proposes the attention in attention (AiA) module, which refines the correlations computed inside attention blocks, and a streamlined AiATrack framework that uses efficient feature reuse and target-background embeddings.

Findings

01. Achieves state-of-the-art performance on six benchmarks.

02. Operates in real time with efficient feature reuse.

03. Enhances correlation accuracy in attention mechanisms.

Abstract

Transformer trackers have achieved impressive advancements recently, where the attention mechanism plays an important role. However, the independent correlation computation in the attention mechanism could result in noisy and ambiguous attention weights, which inhibits further performance improvement. To address this issue, we propose an attention in attention (AiA) module, which enhances appropriate correlations and suppresses erroneous ones by seeking consensus among all correlation vectors. Our AiA module can be readily applied to both self-attention blocks and cross-attention blocks to facilitate feature aggregation and information propagation for visual tracking. Moreover, we propose a streamlined Transformer tracking framework, dubbed AiATrack, by introducing efficient feature reuse and target-background embeddings to make full use of temporal references. Experiments show that our…
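The idea of refining attention weights by seeking consensus among correlation vectors can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `aia_attention`, the inner projection matrices `w_q_inner` and `w_k_inner`, the choice to treat each column of the correlation map as one token, and the plain residual update are all assumptions made for illustration. See the official repository for the actual module.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def layer_norm(x, eps=1e-5):
    # Normalize each row to zero mean and unit variance (no learned scale/shift).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)


def aia_attention(q, k, v, w_q_inner, w_k_inner):
    """Hypothetical attention-in-attention sketch.

    q: (Nq, d) queries, k: (Nk, d) keys, v: (Nk, dv) values.
    w_q_inner, w_k_inner: (Nq, d_inner) projections for the inner attention
    (illustrative parameters, not taken from the paper).
    """
    d = q.shape[-1]
    # Outer attention: raw correlation map between every query and every key.
    corr = q @ k.T / np.sqrt(d)                      # (Nq, Nk)

    # Inner attention: treat each column of the map (one key's correlations
    # with all queries) as a token and let the tokens attend to each other.
    tokens = layer_norm(corr.T)                      # (Nk, Nq)
    q_in = tokens @ w_q_inner                        # (Nk, d_inner)
    k_in = tokens @ w_k_inner                        # (Nk, d_inner)
    inner = softmax(q_in @ k_in.T / np.sqrt(q_in.shape[-1]))  # (Nk, Nk)

    # Consensus among correlation vectors yields a residual correction.
    residual = (inner @ tokens).T                    # (Nq, Nk)
    refined = corr + residual

    # Standard attention from here on, using the refined correlations.
    weights = softmax(refined, axis=-1)              # (Nq, Nk)
    return weights @ v, weights


rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))      # 4 query positions, feature dim 8
k = rng.normal(size=(6, 8))      # 6 key positions
v = rng.normal(size=(6, 5))      # value dim 5
w_q_inner = rng.normal(size=(4, 3))
w_k_inner = rng.normal(size=(4, 3))

out, weights = aia_attention(q, k, v, w_q_inner, w_k_inner)
print(out.shape, weights.shape)
```

Because the correction is added before the softmax, the final weights still form a valid distribution over keys for each query. The inner step only shifts which keys receive mass, based on how similar their correlation vectors are to one another.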

Code & Models

Repositories

Little-Podi/AiATrack (PyTorch, official)

Taxonomy

Topics: Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection · Impact of Light on Environment and Health

Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Dropout · Multi-Head Attention · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Layer Normalization · Adam · Residual Connection