TrackNetV5: Residual-Driven Spatio-Temporal Refinement and Motion Direction Decoupling for Fast Object Tracking
Haonan Tang, Yanjun Chen, Lezhi Jiang, Qianfei Li, Xinyu Guo

TL;DR
TrackNetV5 introduces novel motion direction decoupling and residual spatio-temporal refinement modules, significantly improving small object tracking accuracy and occlusion handling in sports videos while maintaining real-time performance.
Contribution
The paper presents two new mechanisms, Motion Direction Decoupling (MDD) and Residual-Driven Spatio-Temporal Refinement (R-STR), that improve motion understanding and occlusion recovery in a fast object tracking architecture.
Findings
Achieves a state-of-the-art F1-score of 0.9859 on the TrackNetV2 dataset.
Improves tracking accuracy with only a 3.7% increase in FLOPs.
Effectively handles occlusions and directional ambiguities in fast-moving objects.
Abstract
The TrackNet series has established a strong baseline for fast-moving small object tracking in sports. However, existing iterations face significant limitations: V1-V3 struggle with occlusions due to a reliance on purely visual cues, while TrackNetV4, despite introducing motion inputs, suffers from directional ambiguity as its absolute difference method discards motion polarity. To overcome these bottlenecks, we propose TrackNetV5, a robust architecture integrating two novel mechanisms. First, to recover lost directional priors, we introduce the Motion Direction Decoupling (MDD) module. Unlike V4, MDD decomposes temporal dynamics into signed polarity fields, explicitly encoding both movement occurrence and trajectory direction. Second, we propose the Residual-Driven Spatio-Temporal Refinement (R-STR) head. Operating on a coarse-to-fine paradigm, this Transformer-based module leverages…
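
The contrast the abstract draws between an absolute difference and signed polarity fields can be illustrated with a small NumPy sketch. This is not the authors' MDD module: the function names, the single-channel toy frames, and the split into a "brightened" and a "darkened" field are assumptions made only to show why discarding the sign of a frame difference loses direction.

```python
import numpy as np

def absolute_difference(prev_frame, next_frame):
    """Unsigned motion cue: keeps the magnitude of change, drops its sign."""
    diff = next_frame.astype(np.float32) - prev_frame.astype(np.float32)
    return np.abs(diff)

def signed_polarity_fields(prev_frame, next_frame):
    """Hypothetical polarity split (an assumption, not the paper's code).

    Returns two non-negative maps: where intensity rose and where it fell.
    """
    diff = next_frame.astype(np.float32) - prev_frame.astype(np.float32)
    positive = np.maximum(diff, 0.0)   # pixels that became brighter
    negative = np.maximum(-diff, 0.0)  # pixels that became darker
    return positive, negative

def make_frame(ball_col, width=5):
    """Toy 1 x width frame with a bright 'ball' on a dark background."""
    frame = np.zeros((1, width), dtype=np.uint8)
    frame[0, ball_col] = 255
    return frame

left, right = make_frame(1), make_frame(3)

# Ball moving left-to-right versus right-to-left.
abs_forward = absolute_difference(left, right)
abs_backward = absolute_difference(right, left)
pos_fwd, neg_fwd = signed_polarity_fields(left, right)
pos_bwd, neg_bwd = signed_polarity_fields(right, left)

# The unsigned cue is identical for both directions of travel.
same_abs = np.array_equal(abs_forward, abs_backward)
# The polarity fields swap roles when the direction reverses.
polarity_swaps = (np.array_equal(pos_fwd, neg_bwd)
                  and np.array_equal(neg_fwd, pos_bwd))
polarity_differs = not np.array_equal(pos_fwd, pos_bwd)
```

In this toy case the absolute difference is the same whichever way the ball moves, while the two polarity maps exchange places, so a network fed the signed fields can in principle tell the old position from the new one.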
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Gaze Tracking and Assistive Technology
