MambaLCT: Boosting Tracking via Long-term Context State Space Model
Xiaohai Li, Bineng Zhong, Qihua Liang, Guorong Li, Zhiyi Mo, Shuxiang Song

TL;DR
MambaLCT introduces a long-term context state space model for object tracking, leveraging target variation cues from the first to the current frame to improve robustness and achieve state-of-the-art results.
Contribution
It proposes a novel unidirectional Context Mamba module that captures long-term target variation cues and integrates them into attention mechanisms for enhanced tracking.
Findings
Achieves new SOTA on six benchmarks.
Maintains real-time tracking speeds.
Enhances target perception in complex scenarios.
Abstract
Effectively constructing context information with long-term dependencies from video sequences is crucial for object tracking. However, the context length constructed by existing work is limited, only considering object information from adjacent frames or video clips, leading to insufficient utilization of contextual information. To address this issue, we propose MambaLCT, which constructs and utilizes target variation cues from the first frame to the current frame for robust tracking. First, a novel unidirectional Context Mamba module is designed to scan frame features along the temporal dimension, gathering target change cues throughout the entire sequence. Specifically, target-related information in frame features is compressed into a hidden state space through a selective scanning mechanism. The target information across the entire video is continuously aggregated into target variation…
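The temporal scan described in the abstract can be illustrated with a minimal sketch, assuming a Mamba-style diagonal selective state space recurrence. This is not the authors' implementation. The function name context_scan, the projection weights w_delta, w_b and w_c, and the use of a single scalar step size per frame are all assumptions made for illustration. Each frame feature updates a hidden state that carries information forward from the first frame, and the update strength depends on the frame itself (the "selective" part).

```python
import math


def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the step size positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def context_scan(frames, w_delta, w_b, w_c, a):
    """Unidirectional selective scan over per-frame feature vectors.

    frames  : list of T feature vectors (each D floats), first frame first
    w_delta : hypothetical weights (D floats) producing the step size
    w_b, w_c: hypothetical weights (N rows of D floats) producing the
              input-dependent write and read vectors
    a       : N negative decay rates, one per hidden-state slot
    Returns (per-frame readouts, final hidden state of shape D x N).
    """
    d_model, n_state = len(frames[0]), len(a)
    h = [[0.0] * n_state for _ in range(d_model)]
    outputs = []
    for x in frames:  # forward in time only, never backward
        delta = softplus(dot(w_delta, x))      # input-dependent step size
        b = [dot(row, x) for row in w_b]       # what to write into the state
        c = [dot(row, x) for row in w_c]       # how to read the state out
        decay = [math.exp(delta * a_n) for a_n in a]  # in (0, 1) since a_n < 0
        for d in range(d_model):
            for n in range(n_state):
                # Old context fades by `decay`, new frame content is added.
                h[d][n] = decay[n] * h[d][n] + delta * b[n] * x[d]
        outputs.append([dot(c, h[d]) for d in range(d_model)])
    return outputs, h


# Toy usage: three frames with two feature channels and three state slots.
frames = [[1.0, 0.5], [0.2, -0.3], [0.0, 1.0]]
readouts, state = context_scan(
    frames,
    w_delta=[0.1, -0.2],
    w_b=[[0.5, 0.1], [0.3, -0.4], [0.2, 0.2]],
    w_c=[[0.1, 0.2], [-0.3, 0.4], [0.5, 0.5]],
    a=[-1.0, -0.5, -0.1],
)
```

Because the loop only moves forward, the readout for a given frame depends on that frame and earlier ones, which matches the unidirectional, first-frame-to-current-frame behaviour the abstract describes. A real model would run this over token grids with learned per-channel parameters and a parallel scan kernel rather than nested Python loops.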
Taxonomy
Topics: Context-Aware Activity Recognition Systems · Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications
Methods: Softmax · Attention Is All You Need · Mamba: Linear-Time Sequence Modeling with Selective State Spaces
