Dynamic Subframe Splitting and Spatio-Temporal Motion Entangled Sparse Attention for RGB-E Tracking
Pengcheng Shao, Tianyang Xu, Xuefeng Zhu, Xiaojun Wu, Josef Kittler

TL;DR
This paper introduces a novel approach for RGB-E tracking that leverages dynamic subframe splitting and a specialized sparse attention mechanism to better utilize the temporal information in event streams, improving tracking performance.
Contribution
It proposes a dynamic event subframe splitting strategy and a new sparse attention mechanism tailored for event features, enhancing spatio-temporal feature extraction in RGB-E tracking.
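To make the idea of sparse attention concrete, here is a minimal sketch of one generic form of it: each query attends only to its `k` highest-scoring keys instead of all of them. This is not the paper's motion-entangled mechanism, which this summary does not detail. The function name `topk_sparse_attention`, the parameter `k`, and the toy vectors are all assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]


def topk_sparse_attention(queries: List[Vector], keys: List[Vector],
                          values: List[Vector], k: int) -> List[Vector]:
    """Generic top-k sparse attention (illustrative; not the paper's method).

    Each query keeps only its k highest scaled dot-product scores and
    applies a softmax over those k keys; all other keys get zero weight.
    """
    if not 1 <= k <= len(keys):
        raise ValueError("k must be between 1 and the number of keys")
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, key)) for key in keys]
        # Indices of the k best-matching keys for this query.
        keep = sorted(range(len(keys)), key=lambda j: scores[j], reverse=True)[:k]
        # Numerically stable softmax restricted to the kept keys.
        peak = max(scores[j] for j in keep)
        weights = {j: math.exp(scores[j] - peak) for j in keep}
        total = sum(weights.values())
        outputs.append([
            sum(weights[j] / total * values[j][d] for j in keep)
            for d in range(len(values[0]))
        ])
    return outputs


# Toy data (hypothetical): one query, three key/value pairs.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

top1 = topk_sparse_attention(queries, keys, values, k=1)
top2 = topk_sparse_attention(queries, keys, values, k=2)
```

With `k=1` the output is simply the value paired with the best-matching key. With `k=2` the third key, which points away from the query, contributes nothing.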
Findings
Outperforms state-of-the-art methods on the FE240 and COESOT datasets.
Effectively captures motion cues through fine-grained event clustering.
Enhances interaction of event features in both spatial and temporal dimensions.
Abstract
Event-based bionic cameras asynchronously capture dynamic scenes with high temporal resolution and high dynamic range, offering potential for the integration of events and RGB under conditions of illumination degradation and fast motion. Existing RGB-E tracking methods model event characteristics using the attention mechanism of the Transformer before integrating both modalities. Nevertheless, these methods aggregate the event stream into a single event frame and therefore fail to utilise the temporal information inherent in the event stream. Moreover, the traditional attention mechanism is well suited to dense semantic features, whereas an attention mechanism for sparse event features needs to be redesigned. In this paper, we propose a dynamic event subframe splitting strategy to split the event stream into more fine-grained event clusters, aiming to capture spatio-temporal features…
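The contrast between a single aggregated event frame and several finer subframes can be illustrated with a small sketch. It assumes events are `(x, y, timestamp, polarity)` tuples and splits them into chunks of equal event count, so that chunk boundaries in time follow the event rate. That splitting rule is an assumption chosen for illustration; the abstract does not specify the paper's actual dynamic criterion, and the names `split_by_count` and `accumulate` are hypothetical.

```python
from typing import List, Tuple

# Assumed event layout: (x, y, timestamp, polarity in {-1, +1}).
Event = Tuple[int, int, float, int]


def split_by_count(events: List[Event], num_subframes: int) -> List[List[Event]]:
    """Split events, ordered by time, into chunks of nearly equal event count.

    Illustrative rule only: busy periods get short time windows and quiet
    periods get long ones, because every chunk holds the same number of events.
    """
    if num_subframes <= 0:
        raise ValueError("num_subframes must be positive")
    ordered = sorted(events, key=lambda e: e[2])
    n = len(ordered)
    # Integer boundaries so chunk sizes differ by at most one event.
    bounds = [i * n // num_subframes for i in range(num_subframes + 1)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(num_subframes)]


def accumulate(chunk: List[Event], height: int, width: int) -> List[List[int]]:
    """Sum polarities per pixel to form one 2D event frame."""
    frame = [[0] * width for _ in range(height)]
    for x, y, _, p in chunk:
        frame[y][x] += p
    return frame


# Toy stream (hypothetical): a burst of events early, sparser events later.
events = [(0, 0, 0.00, 1), (1, 0, 0.01, 1), (1, 1, 0.02, -1),
          (0, 1, 0.50, 1), (1, 1, 0.90, 1), (0, 0, 0.95, -1)]

chunks = split_by_count(events, 3)
subframes = [accumulate(c, height=2, width=2) for c in chunks]
single_frame = accumulate(events, height=2, width=2)
```

In this toy stream, opposite-polarity events at pixels (0, 0) and (1, 1) cancel in `single_frame`, while the three subframes keep them apart in time. This is the kind of temporal information that aggregation into one frame discards.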
Taxonomy
Topics: Gaze Tracking and Assistive Technology · CCD and CMOS Imaging Sensors · Video Surveillance and Tracking Methods
Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Softmax · Layer Normalization · Dropout · Dense Connections
