SMTrack: State-Aware Mamba for Efficient Temporal Modeling in Visual Tracking

Yinchao Ma; Dengqing Yang; Zhangyu He; Wenfei Yang; Tianzhu Zhang

arXiv:2602.01677 · cs.CV · February 3, 2026

TL;DR

SMTrack introduces a state-aware space model for efficient long-range temporal modeling in visual tracking. It achieves robust tracking at low computational cost without relying on complex customized modules.

Contribution

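The paper proposes SMTrack, a temporal modeling paradigm for visual tracking that captures long-range temporal dependencies with a state-aware space model. This avoids the complex customized modules and substantial computational costs that CNN and Transformer trackers often need to integrate temporal cues.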

Findings

01. Achieves promising tracking performance.

02. Maintains low computational costs.

03. Facilitates long-range temporal interactions.

Abstract

Visual tracking aims to automatically estimate the state of a target object in a video sequence, which is especially challenging in dynamic scenarios. Numerous methods have therefore been proposed that introduce temporal cues to enhance tracking robustness. However, conventional CNN and Transformer architectures exhibit inherent limitations in modeling long-range temporal dependencies in visual tracking, often requiring either complex customized modules or substantial computational costs to integrate temporal cues. Inspired by the success of the state space model, we propose a novel temporal modeling paradigm for visual tracking, termed State-aware Mamba Tracker (SMTrack), which provides a neat pipeline for training and tracking and builds long-range temporal dependencies without customized modules or substantial computational costs. It enjoys several merits. First, we propose a novel…
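For readers unfamiliar with the state space models the abstract refers to, the following is a minimal sketch of a generic diagonal linear state space recurrence. It is not SMTrack's state-aware model, whose details are not given in this excerpt. The function name `ssm_scan`, its parameters, and the scalar-per-frame input are assumptions made purely for illustration. The sketch shows why such models suit long sequences: the hidden state has a fixed size, and each new frame costs one constant-time update.

```python
import math


def ssm_scan(inputs, a, b, c, dt):
    """Run a diagonal linear state space model over a sequence.

    Continuous form, per state channel i:
        h_i'(t) = a_i * h_i(t) + b_i * x(t),   y(t) = sum_i c_i * h_i(t)
    Discretized with zero-order hold at step size dt.
    Illustrative only: every a_i must be nonzero (and negative for stability).
    """
    # Zero-order-hold discretization of each diagonal entry.
    a_bar = [math.exp(dt * ai) for ai in a]
    b_bar = [(ab - 1.0) / ai * bi for ab, ai, bi in zip(a_bar, a, b)]

    h = [0.0] * len(a)  # hidden state: fixed size, independent of sequence length
    outputs = []
    for x in inputs:  # one O(N) update per step, so O(T * N) for T steps
        h = [ab * hi + bb * x for ab, hi, bb in zip(a_bar, h, b_bar)]
        outputs.append(sum(ci * hi for ci, hi in zip(c, h)))
    return outputs


# Impulse response of a single decaying channel: the first input still
# influences later outputs through the state, fading geometrically.
response = ssm_scan([1.0, 0.0, 0.0, 0.0], a=[-1.0], b=[1.0], c=[1.0], dt=1.0)
```

In this toy setting, information from early steps reaches later steps only through the state `h`, so memory does not grow with sequence length, in contrast to attention over all past frames.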

Taxonomy

Topics: Video Surveillance and Tracking Methods · Gaze Tracking and Assistive Technology · Human Pose and Action Recognition