VIFSS: View-Invariant and Figure Skating-Specific Pose Representation Learning for Temporal Action Segmentation
Ryota Tanaka, Tomohiro Suzuki, Keisuke Fujii

TL;DR
This paper introduces VIFSS, a view-invariant, figure skating-specific pose representation learning method, together with a new dataset and annotation scheme. The goal is to improve temporal action segmentation (TAS) of figure skating jumps when annotated data is scarce and the motion is inherently three-dimensional.
Contribution
The work presents a new pose representation learning approach, a specialized 3D dataset, and a fine-grained annotation scheme for better action segmentation in figure skating.
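As a rough illustration of what view-invariant contrastive learning over poses can look like, the sketch below renders the same 3D pose from two random camera yaw angles, embeds both 2D projections, and applies an InfoNCE loss that pulls the two views of one pose together and pushes different poses apart. Everything here is an assumption made for illustration and is not taken from the paper: the orthographic projection, the linear encoder `W`, the 17-joint skeleton, the temperature, and the names `rotate_yaw`, `project_2d`, `embed`, and `info_nce` are all hypothetical.

```python
import numpy as np

def rotate_yaw(pose3d, angle):
    """Rotate poses of shape (batch, joints, 3) about the vertical (y) axis."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, 0.0, s],
                         [0.0, 1.0, 0.0],
                         [-s, 0.0, c]])
    return pose3d @ rotation.T

def project_2d(pose3d):
    """Orthographic projection (an assumption): keep x and y, drop depth."""
    return pose3d[..., :2]

def embed(pose2d, W):
    """Hypothetical linear encoder followed by L2 normalisation."""
    flat = pose2d.reshape(pose2d.shape[0], -1)
    z = flat @ W
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE loss: row i of z_a should match row i of z_b, not the others."""
    logits = z_a @ z_b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
poses = rng.normal(size=(8, 17, 3))        # 8 synthetic poses, 17 joints each
W = rng.normal(size=(17 * 2, 32))          # untrained encoder weights

# Two camera views of the same batch of poses form the positive pairs.
view_a = project_2d(rotate_yaw(poses, rng.uniform(0, 2 * np.pi)))
view_b = project_2d(rotate_yaw(poses, rng.uniform(0, 2 * np.pi)))
loss = info_nce(embed(view_a, W), embed(view_b, W))
print(f"contrastive loss on untrained encoder: {loss:.3f}")
```

In a real pipeline the encoder would be a trained network and the loss would be minimised by gradient descent; the sketch only shows how positive pairs arise from viewing one 3D pose from different angles.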
Findings
Achieves over 92% F1@50 on element-level TAS
View-invariant contrastive pre-training enhances performance with limited data
Effectively models procedural structure of jumps
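For readers unfamiliar with the F1@50 metric cited above, the following is a minimal sketch of the segmental F1 score commonly used in temporal action segmentation: a predicted segment counts as a true positive when its intersection-over-union with an unmatched ground-truth segment of the same label is at least 0.5. The function names `to_segments` and `segmental_f1` and the toy labels are illustrative assumptions; the paper's exact evaluation code may differ, for example in how background frames are treated.

```python
def to_segments(labels, background=None):
    """Collapse frame-wise labels into (label, start, end) runs; end is exclusive."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] != background:
                segments.append((labels[start], start, i))
            start = i
    return segments

def segmental_f1(pred, gt, iou_threshold=0.5, background=None):
    """Segmental F1 at a given IoU threshold (F1@50 when iou_threshold=0.5)."""
    pred_segs = to_segments(pred, background)
    gt_segs = to_segments(gt, background)
    matched = [False] * len(gt_segs)
    tp = fp = 0
    for label, p_start, p_end in pred_segs:
        best_iou, best_j = 0.0, -1
        for j, (g_label, g_start, g_end) in enumerate(gt_segs):
            if g_label != label:
                continue
            inter = max(0, min(p_end, g_end) - max(p_start, g_start))
            union = max(p_end, g_end) - min(p_start, g_start)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_j = iou, j
        # Each ground-truth segment may be claimed by at most one prediction.
        if best_j >= 0 and best_iou >= iou_threshold and not matched[best_j]:
            matched[best_j] = True
            tp += 1
        else:
            fp += 1
    fn = matched.count(False)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy example: the predicted jump is shifted two frames late.
gt = ["none"] * 5 + ["jump"] * 10 + ["none"] * 5
pred = ["none"] * 7 + ["jump"] * 10 + ["none"] * 3
print(segmental_f1(pred, gt, iou_threshold=0.5))
```

Because the metric is computed over segments rather than frames, it penalises over-segmentation (many short spurious segments) more strongly than frame-wise accuracy does.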
Abstract
Understanding human actions from videos plays a critical role across various domains, including sports analytics. In figure skating, accurately recognizing the type and timing of jumps a skater performs is essential for objective performance evaluation. However, this task typically requires expert-level knowledge due to the fine-grained and complex nature of jump procedures. While recent approaches have attempted to automate this task using Temporal Action Segmentation (TAS), there are two major limitations to TAS for figure skating: the annotated data is insufficient, and existing methods do not account for the inherent three-dimensional aspects and procedural structure of jump actions. In this work, we propose a new TAS framework for figure skating jumps that explicitly incorporates both the three-dimensional nature and the semantic procedure of jump movements. First, we propose a…
