Context-Aware Token Pruning and Discriminative Selective Attention for Transformer Tracking
Janani Kugarajeevan, Thanikasalam Kokul, Amirthalingam Ramanan, Subha Fernando

TL;DR
This paper introduces CPDATrack, a novel transformer-based tracking framework that selectively prunes background tokens and employs discriminative attention to improve tracking accuracy and efficiency, especially in cluttered scenes.
Contribution
The paper proposes a learnable token-filtering module and a discriminative attention mechanism that improve transformer tracking by reducing background interference while preserving contextual information near the target.
Findings
Achieves state-of-the-art performance on GOT-10k with 75.1% average overlap.
Effectively suppresses distractors and background tokens, improving tracking accuracy.
Enhances computational efficiency through targeted token pruning.
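For readers unfamiliar with the average overlap (AO) figure quoted above, the following is a minimal sketch of how such a score can be computed: the mean intersection-over-union between predicted and ground-truth boxes across frames. The function names and the (x, y, w, h) box layout are assumptions made for illustration; the official GOT-10k evaluation may aggregate results differently (for example, per sequence), so this should not be read as the benchmark's reference implementation.

```python
def iou_xywh(a, b):
    """Intersection over union of two boxes given as (x, y, w, h).

    The (x, y, w, h) layout is an assumption for this sketch.
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping rectangle (zero if disjoint).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def average_overlap(predictions, ground_truths):
    """Mean IoU over paired predicted and ground-truth boxes."""
    if not predictions or len(predictions) != len(ground_truths):
        raise ValueError("need two non-empty box lists of equal length")
    total = sum(iou_xywh(p, g) for p, g in zip(predictions, ground_truths))
    return total / len(predictions)
```

An AO of 75.1% would correspond to `average_overlap(...)` returning roughly 0.751 under this simplified definition.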
Abstract
One-stream Transformer-based trackers have demonstrated remarkable performance by concatenating template and search region tokens, thereby enabling joint attention across all tokens. However, enabling an excessive proportion of background search tokens to attend to the target template tokens weakens the tracker's discriminative capability. Several token pruning methods have been proposed to mitigate background interference; however, they often remove tokens near the target, leading to the loss of essential contextual information and degraded tracking performance. Moreover, the presence of distractors within the search tokens further reduces the tracker's ability to accurately identify the target. To address these limitations, we propose CPDATrack, a novel tracking framework designed to suppress interference from background and distractor tokens while enhancing computational efficiency.…
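The abstract describes pruning background search tokens in a one-stream tracker. As a rough illustration of the general idea only, the sketch below scores each search token by how much attention it receives from the template tokens and keeps the highest-scoring fraction. This is a generic, non-learned heuristic written for this summary; it is not CPDATrack's learnable filtering module or its discriminative attention, and the names `prune_search_tokens` and `keep_ratio` are hypothetical.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def prune_search_tokens(search_tokens, template_tokens, keep_ratio=0.5):
    """Keep the search tokens that template tokens attend to most.

    Illustrative heuristic (an assumption, not the paper's method):
    each template token acts as a query over the search tokens, and the
    resulting attention weights are averaged into one score per search
    token. Returns (kept_indices, scores), with kept_indices in the
    original token order.
    """
    dim = len(template_tokens[0])
    scores = [0.0] * len(search_tokens)
    for query in template_tokens:
        logits = [_dot(query, key) / math.sqrt(dim) for key in search_tokens]
        for i, w in enumerate(_softmax(logits)):
            scores[i] += w / len(template_tokens)
    n_keep = max(1, math.ceil(keep_ratio * len(search_tokens)))
    ranked = sorted(range(len(search_tokens)),
                    key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep]), scores
```

A purely score-based rule like this one can discard low-scoring tokens that lie close to the target, which is the loss of contextual information that the abstract attributes to earlier pruning methods.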
Taxonomy
Topics: Video Surveillance and Tracking Methods · Gaze Tracking and Assistive Technology · Human Pose and Action Recognition
