Robust Tracking via Mamba-based Context-aware Token Learning
Jinxia Xie, Bineng Zhong, Qihua Liang, Ning Li, Zhiyi Mo, Shuxiang Song

TL;DR
The paper introduces a simple, robust tracker that learns temporal information separately from appearance information, using Mamba-based context-aware token learning to improve efficiency and accuracy in real-time tracking.
Contribution
It proposes a novel tracker that extracts temporal relations from representative tokens, reducing computational cost and interference compared to existing methods.
Findings
Achieves competitive performance on multiple benchmarks.
Operates in real-time with efficient computation.
Effectively models temporal relations with a Mamba-based module.
Abstract
Making a good trade-off between performance and computational cost is crucial for a tracker. However, current popular methods typically focus on complicated and time-consuming learning that combines temporal and appearance information by inputting more and more images (or features). Consequently, these methods not only increase the model's computational cost and learning burden but also introduce much useless and potentially interfering information. To alleviate the above issues, we propose a simple yet robust tracker that separates temporal information learning from appearance modeling and extracts temporal relations from a set of representative tokens rather than several images (or features). Specifically, we introduce one track token for each frame to collect the target's appearance information in the backbone. Then, we design a Mamba-based Temporal Module for track tokens to be…
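The idea of reading temporal relations from one compact token per frame, rather than from whole images or feature maps, can be illustrated with a toy recurrence. The sketch below is not the paper's Temporal Module: the function name `selective_scan`, the gate formula, and the `decay_logit` parameter are all hypothetical, chosen only to show an input-dependent (selective) state update whose cost grows linearly with the number of frames while the carried state stays a single token-sized vector.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def selective_scan(track_tokens, decay_logit=0.0):
    """Toy diagonal state-space scan over per-frame track tokens.

    Hypothetical update (an assumption, not the paper's formulation):
        keep_t = sigmoid(decay_logit - |x_t|)
        h_t    = keep_t * h_{t-1} + (1 - keep_t) * x_t
    The gate depends on the input, so strong token channels overwrite
    the running state more than weak ones.
    """
    dim = len(track_tokens[0])
    state = [0.0] * dim  # temporal state, same size as one track token
    outputs = []
    for token in track_tokens:  # one step per frame: linear in sequence length
        new_state = []
        for h, x in zip(state, token):
            keep = sigmoid(decay_logit - abs(x))
            new_state.append(keep * h + (1.0 - keep) * x)
        state = new_state
        outputs.append(list(state))
    return outputs


if __name__ == "__main__":
    # Four frames, each summarized by a 2-channel track token.
    frames = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
    for step, out in enumerate(selective_scan(frames)):
        print(step, out)
```

With a constant token, the first channel of the state moves steadily toward the token value while the zero channel stays at zero, which is the kind of running temporal summary a per-frame token sequence makes cheap to maintain.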
Taxonomy
Topics: Robotic Path Planning Algorithms
Methods: Sparse Evolutionary Training · Mamba: Linear-Time Sequence Modeling with Selective State Spaces · Attentive Walk-Aggregating Graph Neural Network · Focus
