VIFSS: View-Invariant and Figure Skating-Specific Pose Representation Learning for Temporal Action Segmentation

Ryota Tanaka; Tomohiro Suzuki; Keisuke Fujii

arXiv:2508.10281·cs.CV·August 15, 2025


TL;DR

This paper introduces VIFSS, a view-invariant pose representation learning method, together with a new dataset and annotation scheme, to improve temporal action segmentation of figure skating jumps. It targets two problems: scarce annotated data and the three-dimensional complexity of jump motion.

Contribution

The work contributes a pose representation learning approach, a specialized 3D dataset, and a fine-grained annotation scheme for temporal action segmentation in figure skating.

Findings

01. Achieves over 92% F1@50 on element-level TAS

02. View-invariant contrastive pre-training enhances performance with limited data

03. Effectively models procedural structure of jumps
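
For readers unfamiliar with the F1@50 metric, the following is a minimal sketch of the segmental F1 score as it is commonly computed in the temporal action segmentation literature: a predicted segment counts as a true positive when its IoU with an unmatched ground-truth segment of the same label is at least 0.5. This is not the authors' evaluation code. The function names (`segments`, `f1_at_k`), the `background` parameter, and the toy labels are assumptions made for illustration, and the paper's exact protocol may differ.

```python
def segments(labels, background=None):
    """Collapse a frame-wise label sequence into (label, start, end) runs.

    `end` is exclusive. Runs whose label equals `background` are skipped.
    """
    segs = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] != background:
                segs.append((labels[start], start, i))
            start = i
    return segs


def f1_at_k(pred, gt, threshold=0.5, background=None):
    """Segmental F1 at an IoU threshold (0.5 corresponds to F1@50)."""
    pred_segs = segments(pred, background)
    gt_segs = segments(gt, background)
    matched = [False] * len(gt_segs)
    tp = fp = 0

    for label, ps, pe in pred_segs:
        best_iou, best_j = 0.0, -1
        for j, (g_label, gs, ge) in enumerate(gt_segs):
            if g_label != label:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            union = max(pe, ge) - min(ps, gs)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_j = iou, j
        # Each ground-truth segment can be matched at most once.
        if best_j >= 0 and best_iou >= threshold and not matched[best_j]:
            tp += 1
            matched[best_j] = True
        else:
            fp += 1

    fn = matched.count(False)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical labels): the prediction is shifted by one frame.
gt = ["none"] * 3 + ["axel"] * 4 + ["none"] * 3
pred = ["none"] * 4 + ["axel"] * 4 + ["none"] * 2
score_50 = f1_at_k(pred, gt, threshold=0.5)
score_70 = f1_at_k(pred, gt, threshold=0.7)
```

Because the metric works on segments rather than frames, a small boundary shift is tolerated at a loose threshold and penalized at a strict one, which is why the toy prediction scores differently at 0.5 and 0.7.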

Abstract

Understanding human actions from videos plays a critical role across various domains, including sports analytics. In figure skating, accurately recognizing the type and timing of jumps a skater performs is essential for objective performance evaluation. However, this task typically requires expert-level knowledge due to the fine-grained and complex nature of jump procedures. While recent approaches have attempted to automate this task using Temporal Action Segmentation (TAS), there are two major limitations to TAS for figure skating: the annotated data is insufficient, and existing methods do not account for the inherent three-dimensional aspects and procedural structure of jump actions. In this work, we propose a new TAS framework for figure skating jumps that explicitly incorporates both the three-dimensional nature and the semantic procedure of jump movements. First, we propose a…
