Deformable Siamese Attention Networks for Visual Object Tracking
Yuechen Yu, Yilei Xiong, Weilin Huang, Matthew R. Scott

TL;DR
This paper introduces Deformable Siamese Attention Networks (SiamAttn) for visual object tracking. Deformable self-attention and cross-attention aggregate contextual features and implicitly update the target template, yielding state-of-the-art results on six benchmarks.
Contribution
The paper proposes a novel Siamese attention mechanism with deformable self- and cross-attention for adaptive template updating and enhanced context modeling in tracking.
Findings
Achieves new state-of-the-art results on six benchmarks.
Outperforms SiamRPN++ by significant margins on VOT 2016 and 2018.
Demonstrates effective adaptive template updating through cross-attention.
Abstract
Siamese-based trackers have achieved excellent performance on visual object tracking. However, the target template is not updated online, and the features of the target template and search image are computed independently in a Siamese architecture. In this paper, we propose Deformable Siamese Attention Networks, referred to as SiamAttn, by introducing a new Siamese attention mechanism that computes deformable self-attention and cross-attention. The self-attention learns strong context information via spatial attention, and selectively emphasizes interdependent channel-wise features with channel attention. The cross-attention is capable of aggregating rich contextual inter-dependencies between the target template and the search image, providing an implicit way to adaptively update the target template. In addition, we design a region refinement module that computes depth-wise cross…
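The interplay of self-attention and cross-attention described in the abstract can be illustrated with a minimal sketch. It uses plain scaled dot-product attention in pure Python and omits the deformable sampling, learned projections, and channel attention, so it is a simplified stand-in and not the authors' implementation. The names `softmax`, `attend`, and `siamese_attention` are hypothetical, and each feature map is assumed to be flattened into a list of per-position feature vectors.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention.

    Each query position receives a softmax-weighted average of `values`,
    weighted by its similarity to the corresponding `keys`.
    """
    scale = math.sqrt(len(keys[0]))
    channels = len(values[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, values))
                    for c in range(channels)])
    return out


def siamese_attention(template, search):
    """Enhance both branches with residual self- and cross-attention.

    Assumption (not from the paper): the two context terms are simply
    added to the original features.
    """
    def fuse(own, other):
        self_ctx = attend(own, own, own)        # context within one branch
        cross_ctx = attend(own, other, other)   # context from the other branch
        return [[o + s + c for o, s, c in zip(ov, sv, cv)]
                for ov, sv, cv in zip(own, self_ctx, cross_ctx)]

    return fuse(template, search), fuse(search, template)


# Toy features: 2 template positions and 3 search positions, 2 channels each.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_template, new_search = siamese_attention(template, search)
```

Because the template features are recomputed from the current search image on every frame, the cross term is what gives the template an implicit online update in this sketch; both branches keep their original spatial size.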
Taxonomy
Topics: Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection · Image Enhancement Techniques
