DASTSiam: Spatio-Temporal Fusion and Discriminative Augmentation for Improved Siamese Tracking
Yucheng Huang, Eksan Firkat, Ziwang Xiao, Jihong Zhu, Askar Hamdulla

TL;DR
DASTSiam adds two transformer-based modules to a Siamese tracker: one improves robustness to target appearance changes, the other improves discrimination against cluttered backgrounds. Together they yield better performance on public datasets.
Contribution
The paper proposes two transformer-based modules, spatio-temporal (ST) fusion and Discriminative Augmentation (DA), to improve the robustness and accuracy of Siamese trackers.
Findings
Enhanced robustness against appearance changes.
Improved target discrimination in cluttered backgrounds.
Better performance demonstrated on public datasets.
Abstract
Tracking tasks based on deep neural networks have greatly improved with the emergence of Siamese trackers. However, the appearance of targets often changes during tracking, which can reduce the robustness of the tracker when facing challenges such as aspect ratio change, occlusion, and scale variation. In addition, cluttered backgrounds can produce multiple high-response points in the response map, resulting in incorrect target positioning. In this paper, we introduce DASTSiam, which adds two transformer-based modules to improve Siamese tracking: the spatio-temporal (ST) fusion module and the Discriminative Augmentation (DA) module. The ST module uses cross-attention-based accumulation of historical cues to improve robustness against object appearance changes, while the DA module associates semantic information between the template and search region to improve target discrimination. Moreover,…
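Both modules described in the abstract rely on cross-attention. As a rough illustration of that operation only, the sketch below implements generic single-head scaled dot-product cross-attention in plain Python. It is not the authors' implementation: learned projections, multiple heads, and normalization layers are omitted, the names `cross_attention`, `search_tokens`, and `template_tokens` are hypothetical, and the choice of search tokens as queries and template tokens as keys and values is an assumption made for the example.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention.

    queries: list of d-dimensional vectors (assumed here: search-region tokens)
    keys, values: equally long lists of vectors (assumed here: template tokens)
    Returns one output vector per query, each a weighted average of `values`.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    dim_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by 1/sqrt(d).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_v)]
        outputs.append(out)
    return outputs

# Hypothetical toy features: 3 search tokens, 2 template tokens, dimension 2.
search_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
template_tokens = [[1.0, 0.0], [0.0, 1.0]]
fused = cross_attention(search_tokens, template_tokens, template_tokens)
```

Each row of `fused` is a mixture of template tokens, weighted by how similar the corresponding search token is to each of them. A search token equally similar to both template tokens receives an even mixture.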
Taxonomy
Topics: Video Surveillance and Tracking Methods · Impact of Light on Environment and Health · Air Quality Monitoring and Forecasting
