1st Place Solution to the 1st SkatingVerse Challenge
Tao Sun, Yuanzi Fu, Kaicheng Yang, Jian Wu, Ziyong Feng

TL;DR
This paper details the winning approach for the 1st SkatingVerse Challenge: DINO-based ROI cropping of the raw video followed by a logit-level ensemble of three models, reaching a leaderboard score of 95.73%.
Contribution
It combines DINO-based ROI extraction with an ensemble of three diverse models: Unmasked Teacher, UniformerV2, and InfoGCN.
Findings
Achieved 95.73% leaderboard score
Effective ROI cropping with DINO framework
Successful ensemble of three models
Abstract
This paper presents the winning solution for the 1st SkatingVerse Challenge. We propose a method that involves several steps. To begin, we leverage the DINO framework to extract the Region of Interest (ROI) and perform precise cropping of the raw video footage. Subsequently, we employ three distinct models, namely Unmasked Teacher, UniformerV2, and InfoGCN, to capture different aspects of the data. By ensembling the prediction results based on logits, our solution attains an impressive leaderboard score of 95.73%.
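The logit-level ensembling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the logits are combined, so the weighted average below (equal weights by default) is an assumption, as are the function name `ensemble_logits`, the dictionary keys, and the toy numbers; none of them come from the authors' code.

```python
from typing import Dict, List, Optional, Sequence


def ensemble_logits(
    logits_per_model: Dict[str, Sequence[Sequence[float]]],
    weights: Optional[Dict[str, float]] = None,
) -> List[int]:
    """Fuse per-class logits across models and return one class index per clip.

    Assumption: fusion is a weighted mean of raw logits. Every model must
    provide logits for the same clips and the same class ordering.
    """
    names = list(logits_per_model)
    if weights is None:
        # Equal weighting is a hypothetical default, not the authors' setting.
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)

    n_clips = len(logits_per_model[names[0]])
    predictions = []
    for clip in range(n_clips):
        n_classes = len(logits_per_model[names[0]][clip])
        # Weighted mean of each class logit over all models.
        fused = [
            sum(weights[name] * logits_per_model[name][clip][cls] for name in names)
            / total_weight
            for cls in range(n_classes)
        ]
        # Predicted class is the index of the largest fused logit.
        predictions.append(max(range(n_classes), key=fused.__getitem__))
    return predictions


# Toy logits for 2 clips and 3 classes (made-up values for illustration only).
toy_logits = {
    "unmasked_teacher": [[2.0, 0.5, 0.1], [0.2, 0.3, 1.5]],
    "uniformer_v2": [[1.2, 1.0, 0.0], [0.1, 2.0, 0.4]],
    "infogcn": [[0.3, 1.4, 0.2], [0.0, 0.5, 2.5]],
}
predictions = ensemble_logits(toy_logits)
print(predictions)
```

Averaging logits rather than hard votes lets a confident model outweigh two uncertain ones. Passing a `weights` dictionary would let one model count more than the others, for example if it scores higher on a validation split.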
Taxonomy
Topics: Sports Analytics and Performance
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Multi-Head Attention · Dense Connections · Residual Connection · Softmax · Vision Transformer · self-DIstillation with NO labels
