From Skeletons to Pixels: Few-Shot Precise Event Spotting via Representation and Prediction Distillation

Zhong Han Ervin Yeoh and Jiang Kan

arXiv:2604.22839 · cs.CV · April 28, 2026


TL;DR

This paper introduces two multimodal distillation strategies, AWD and AMD-FED, to improve few-shot precise event spotting in sports videos, demonstrating superior performance on tennis and figure skating datasets.

Contribution

The paper proposes novel adaptive prediction and representation distillation methods that enhance few-shot event spotting accuracy using multimodal knowledge transfer.

Findings

1. Both methods outperform single-modality baselines and prior approaches.
2. Representation-level distillation shows stronger performance on tennis.
3. AMD-FED generalizes well to the figure skating dataset.

Abstract

Precise Event Spotting (PES) is essential in fast-paced sports such as tennis, where fine-grained events occur within very short temporal windows. Accurate frame-level localization is challenging because of motion blur, subtle action differences, and limited annotated data. We study two complementary distillation strategies for few-shot PES: Adaptive Weight Distillation (AWD), a prediction-level method that adaptively weights teacher supervision on unlabeled data, and Annealed Multimodal Distillation for Few-Shot Event Detection (AMD-FED), a representation-level framework that transfers robust skeleton knowledge into visual modalities through annealed pseudo-labeling. Both methods use multimodal distillation to improve generalization under limited supervision. We evaluate them on F3Set-Tennis(sub) under few-shot k-clip settings, where they consistently outperform single-modality…
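
The abstract names two ingredients, adaptive weighting of teacher supervision and annealing, without giving their formulas. The sketch below is a generic illustration of how such a loss is often assembled, not the authors' method. Every name in it (`anneal_weight`, `weighted_distill_loss`), the use of the teacher's maximum class probability as the per-frame weight, and the linear decay schedule are assumptions made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one frame's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def anneal_weight(step, total_steps, start=1.0, end=0.0):
    """Hypothetical schedule: linearly move the distillation weight
    from `start` to `end` over training (assumes total_steps > 0)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


def weighted_distill_loss(student_logits, teacher_logits, step, total_steps):
    """Per-frame cross-entropy of the student against the teacher's soft
    labels, scaled by teacher confidence and by the annealed weight.

    Assumption: confidence is the teacher's max class probability.
    The paper's actual weighting rule is not given in the abstract.
    """
    lam = anneal_weight(step, total_steps)
    total = 0.0
    for s, t in zip(student_logits, teacher_logits):
        p_t = softmax(t)  # teacher soft label (e.g. a skeleton model)
        p_s = softmax(s)  # student prediction (e.g. an RGB model)
        conf = max(p_t)   # confident teacher frames count for more
        ce = -sum(pt * math.log(ps + 1e-12) for pt, ps in zip(p_t, p_s))
        total += conf * ce
    return lam * total / len(student_logits)


if __name__ == "__main__":
    teacher = [[4.0, 0.0, 0.0], [0.2, 0.1, 0.0]]  # one confident, one unsure frame
    student = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.0]]
    for step in (0, 50, 100):
        print(step, weighted_distill_loss(student, teacher, step, 100))
```

In this sketch the unlabeled frames contribute less as training proceeds, and frames where the teacher is unsure contribute less at every step. A real system would add a supervised loss on the few labeled clips.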
