Dynamic Pondering Sparsity-aware Mixture-of-Experts Transformer for Event Stream based Visual Object Tracking

Shiao Wang; Xiao Wang; Duoqing Yang; Wenhao Zhang; Bo Jiang; Lin Zhu; Yonghong Tian; Bin Luo

arXiv:2605.06112·cs.CV·May 8, 2026

TL;DR

This paper presents a novel event sparsity-aware Transformer framework for visual object tracking that models event-density variations and adapts inference depth, improving accuracy and efficiency.

Contribution

It introduces a hierarchical multi-density feature learning approach with a sparsity-aware Mixture-of-Experts and dynamic pondering for adaptive tracking.

Findings

01. Achieves a good balance between accuracy and computational efficiency.

02. Outperforms existing event-based trackers on multiple datasets.

03. Demonstrates robustness under challenging imaging conditions.

Abstract

Despite significant progress, RGB-based trackers remain vulnerable to challenging imaging conditions, such as low illumination and fast motion. Event cameras offer a promising alternative by asynchronously capturing pixel-wise brightness changes, providing high dynamic range and high temporal resolution. However, existing event-based trackers often neglect the intrinsic spatial sparsity and temporal density of event data, while relying on a single fixed temporal-window sampling strategy that is suboptimal under varying motion dynamics. In this paper, we propose an event sparsity-aware tracking framework that explicitly models event-density variations across multiple temporal scales. Specifically, the proposed framework progressively injects sparse, medium-density, and dense event search regions into a three-stage Vision Transformer backbone, enabling hierarchical multi-density feature…
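
To make the idea of "event-density variations across multiple temporal scales" concrete, here is a minimal sketch of how one event stream can yield sparse, medium-density, and dense views by accumulating events over progressively longer time windows. It is an illustration under stated assumptions, not the authors' implementation: the `(x, y, t, polarity)` event layout, the integer microsecond timestamps, the window lengths, and the function names `accumulate_events`, `density`, and `multi_density_frames` are all hypothetical and do not come from the paper or its repository.

```python
from typing import List, Sequence, Tuple

# Assumed event layout: (x, y, timestamp in microseconds, polarity in {-1, +1}).
Event = Tuple[int, int, int, int]
Frame = List[List[int]]


def accumulate_events(events: Sequence[Event], t_end: int, window: int,
                      width: int, height: int) -> Frame:
    """Count events per pixel whose timestamp lies in (t_end - window, t_end]."""
    frame = [[0] * width for _ in range(height)]
    t_start = t_end - window
    for x, y, t, _polarity in events:
        if t_start < t <= t_end and 0 <= x < width and 0 <= y < height:
            frame[y][x] += 1
    return frame


def density(frame: Frame) -> float:
    """Fraction of pixels that received at least one event."""
    total = sum(len(row) for row in frame)
    active = sum(1 for row in frame for count in row if count > 0)
    return active / total if total else 0.0


def multi_density_frames(events: Sequence[Event], t_end: int,
                         windows: Sequence[int], width: int,
                         height: int) -> List[Frame]:
    """Build one frame per window, ordered from shortest (sparse) to longest (dense)."""
    return [accumulate_events(events, t_end, w, width, height)
            for w in sorted(windows)]


# Toy stream: one event per millisecond, each landing on a distinct pixel of a 4x4 grid.
events = [(i % 4, i // 4, 1000 * i, 1) for i in range(16)]

# Hypothetical window lengths (microseconds) for the sparse, medium, and dense views.
frames = multi_density_frames(events, t_end=15000,
                              windows=(4000, 8000, 16000),
                              width=4, height=4)
densities = [density(f) for f in frames]
print(densities)  # [0.25, 0.5, 1.0]
```

On this toy stream the shortest window covers a quarter of the pixels and the longest covers all of them, which is the kind of sparse-to-dense progression the abstract describes feeding into successive backbone stages. How the paper actually selects windows and search regions is not stated in the text available here.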

Code & Models

Repositories

Event-AHU/OpenEvTracking (GitHub)