Multi-Focus Temporal Shifting for Precise Event Spotting in Sports Videos
Hao Xu, Xinyu Wei, Sam Wells, Sunil Aryal

TL;DR
This paper introduces the Multi-Focus Temporal Shifting Module (MFS), a lightweight, adaptable module that improves precise event spotting in sports videos by modeling multi-scale temporal dependencies, validated on a new table tennis dataset.
Contribution
The paper presents MFS, a novel multi-scale temporal shifting module that enhances existing PES models, and separately introduces the first table tennis PES benchmark dataset.
Findings
MFS improves PES accuracy by +4.09 mAP over baselines.
MFS maintains low computational overhead with 45 GFLOPs.
Extensive experiments validate MFS's effectiveness across five benchmarks.
Abstract
Precise Event Spotting (PES) in sports videos requires frame-level recognition of fine-grained actions from single-camera footage. Existing PES models typically incorporate lightweight temporal modules such as the Gate Shift Module (GSM) or Gate Shift Fuse (GSF) to enrich 2D CNN feature extractors with temporal context. However, these modules are limited in both temporal receptive field and spatial adaptability. We propose the Multi-Focus Temporal Shifting Module (MFS), which enhances GSM with multi-scale temporal shifts and a Group Focus Module, enabling efficient modeling of both short- and long-term dependencies while focusing on salient regions. MFS is a lightweight, plug-and-play module that integrates seamlessly with diverse 2D backbones. To further advance the field, we introduce the Table Tennis Australia dataset, the first PES benchmark for table tennis containing over 4,800 precisely…
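The multi-scale temporal shifting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it omits GSM's learned gating and the Group Focus Module entirely, and the function name `multi_scale_shift`, the strides `(1, 2, 4)`, and the way channels are assigned to strides are all assumptions made for illustration. It only shows how shifting different channels by different temporal offsets lets a single frame's feature vector mix information from both near and distant frames.

```python
from typing import List, Sequence

Frame = List[float]  # one feature vector (channels) per video frame


def multi_scale_shift(seq: List[Frame], strides: Sequence[int] = (1, 2, 4)) -> List[Frame]:
    """Hypothetical multi-scale temporal shift (illustrative only).

    For the i-th stride s, channel 2*i is shifted forward in time by s frames
    (each frame receives the value from s frames earlier) and channel 2*i + 1
    is shifted backward by s frames. Positions with no source frame are
    zero-padded. All remaining channels pass through unchanged.
    """
    num_frames, num_channels = len(seq), len(seq[0])
    if num_channels < 2 * len(strides):
        raise ValueError("need at least two channels per stride")

    out = [list(frame) for frame in seq]  # copy so the input is not modified
    for i, stride in enumerate(strides):
        for ch, offset in ((2 * i, stride), (2 * i + 1, -stride)):
            for t in range(num_frames):
                src = t - offset  # index of the frame this value comes from
                out[t][ch] = seq[src][ch] if 0 <= src < num_frames else 0.0
    return out


# Toy input: 5 frames, 7 channels, every channel of frame t holds t + 1.
features = [[float(t + 1)] * 7 for t in range(5)]
shifted = multi_scale_shift(features)
```

After the call, each row of `shifted` combines values drawn from frames 1, 2, and 4 steps away in both directions, alongside one unshifted channel. Small strides capture short-term context, and large strides extend the temporal receptive field at no extra parameter cost.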
Taxonomy
Topics: Human Pose and Action Recognition · Video Analysis and Summarization · Robot Manipulation and Learning
