MambaEVT: Event Stream based Visual Object Tracking using State Space Model

Xiao Wang; Chao Wang; Shiao Wang; Xixi Wang; Zhicheng Zhao; Lin Zhu; Bo Jiang

arXiv:2408.10487 · cs.CV · August 21, 2024

TL;DR

MambaEVT introduces a novel event-based visual tracking framework utilizing a state space model and dynamic template updates, achieving improved accuracy and efficiency on large-scale datasets.

Contribution

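The paper proposes a Mamba-based tracking framework that uses a linear-complexity state space model as its backbone and adds a dynamic template update strategy. It targets the performance bottlenecks that the abstract attributes to vision Transformers and static templates in existing event-based trackers.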

Findings

1. Achieves a good balance between accuracy and computational cost.
2. Performs well on multiple large-scale datasets.
3. Introduces a dynamic template update mechanism.

Abstract

Event camera-based visual tracking has drawn more and more attention in recent years due to the unique imaging principle and advantages of low energy consumption, high dynamic range, and dense temporal resolution. Current event-based tracking algorithms are gradually hitting their performance bottlenecks, due to the utilization of vision Transformer and the static template for target object localization. In this paper, we propose a novel Mamba-based visual tracking framework that adopts the state space model with linear complexity as a backbone network. The search regions and target template are fed into the vision Mamba network for simultaneous feature extraction and interaction. The output tokens of search regions will be fed into the tracking head for target localization. More importantly, we consider introducing a dynamic template update strategy into the tracking framework using…
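The data flow described in the abstract (template and search tokens processed jointly by a linear-complexity state space backbone, with only the search-region outputs passed on to the tracking head) can be illustrated with a toy sketch. This is not the authors' network: it assumes a fixed-parameter, diagonal, one-directional state space recurrence, whereas Mamba uses input-dependent parameters. The names `ssm_scan` and `track_step` and the parameters `a`, `b`, `c` are hypothetical and chosen only for illustration.

```python
from typing import List

Vector = List[float]


def ssm_scan(tokens: List[Vector], a: Vector, b: Vector, c: Vector) -> List[Vector]:
    """Run a diagonal linear state space recurrence over a token sequence.

    h_t = a * h_{t-1} + b * x_t   (elementwise)
    y_t = c * h_t                 (elementwise)

    The sequence is visited once, so the cost grows linearly with its length.
    """
    dim = len(a)
    h = [0.0] * dim  # hidden state starts at zero
    outputs = []
    for x in tokens:
        h = [a[i] * h[i] + b[i] * x[i] for i in range(dim)]
        outputs.append([c[i] * h[i] for i in range(dim)])
    return outputs


def track_step(template_tokens: List[Vector], search_tokens: List[Vector],
               a: Vector, b: Vector, c: Vector) -> List[Vector]:
    """Scan template and search tokens as one sequence; keep search outputs.

    Placing the template first lets its content reach the search tokens
    through the hidden state (assumed ordering, for illustration only).
    """
    sequence = template_tokens + search_tokens
    outputs = ssm_scan(sequence, a, b, c)
    # Only the search-region outputs would go on to a tracking head.
    return outputs[len(template_tokens):]


# Toy 2-dimensional tokens; values are arbitrary.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[0.5, 0.5], [1.0, 1.0], [0.0, 2.0]]
params = dict(a=[0.5, 0.5], b=[1.0, 1.0], c=[1.0, 1.0])

search_out = track_step(template, search, **params)
no_template_out = track_step([], search, **params)
print(search_out)       # search outputs conditioned on the template
print(no_template_out)  # same scan without any template tokens
```

Comparing the two printed results shows that the template changes the search-token outputs even though no explicit attention is computed; the interaction happens through the recurrent state. A dynamic template update would, under this sketch, amount to changing the contents of `template` between frames.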

Code & Models

Repositories

event-ahu/mambaevt (PyTorch, official)

Taxonomy

Topics: Video Surveillance and Tracking Methods · Data Stream Mining Techniques

Methods: Attention Is All You Need · Linear Layer · Residual Connection · Multi-Head Attention · Adam · Layer Normalization · Position-Wise Feed-Forward Layer · Dense Connections · Byte Pair Encoding · Absolute Position Encodings