Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation
Antonio Finocchiaro, Giovanni Maria Farinella, Antonino Furnari

TL;DR
This paper introduces a novel calisthenics skill classification method that bypasses pose estimation by using depth estimation and athlete localization, resulting in faster inference and higher accuracy for real-time applications.
Contribution
The work presents a direct, efficient approach leveraging depth estimation and athlete patch retrieval, outperforming traditional pose-based methods in speed and accuracy.
Findings
38.3x faster inference than the pose-based approach when using RGB patches
Higher classification accuracy than the pose-based approach (0.837 vs. 0.815)
Modular pipeline enabling future enhancements
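To put the two headline numbers in context, the short Python sketch below works out the absolute and relative accuracy gain and shows what a 38.3x speed-up would mean for per-frame latency. The 100 ms baseline latency is a purely hypothetical figure chosen for illustration; it is not reported in the text above.

```python
# Figures quoted in the findings above.
SPEEDUP = 38.3
ACC_PROPOSED = 0.837
ACC_BASELINE = 0.815


def absolute_gain(new: float, old: float) -> float:
    """Difference in accuracy between two methods."""
    return new - old


def relative_gain(new: float, old: float) -> float:
    """Accuracy gain expressed as a fraction of the baseline accuracy."""
    return (new - old) / old


def sped_up_latency(baseline_ms: float, speedup: float) -> float:
    """Latency obtained by dividing a baseline latency by a speed-up factor."""
    return baseline_ms / speedup


abs_gain = absolute_gain(ACC_PROPOSED, ACC_BASELINE)
rel_gain = relative_gain(ACC_PROPOSED, ACC_BASELINE)
print(f"absolute gain: {abs_gain:.3f}")  # 0.022
print(f"relative gain: {rel_gain:.1%}")  # 2.7%

# HYPOTHETICAL baseline latency, used only to illustrate the ratio.
HYPOTHETICAL_BASELINE_MS = 100.0
new_latency = sped_up_latency(HYPOTHETICAL_BASELINE_MS, SPEEDUP)
print(f"latency after speed-up: {new_latency:.2f} ms")  # 2.61 ms
```

Read this way, the accuracy difference is modest (about two percentage points), while the speed-up is the larger practical change for real-time use.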
Abstract
Calisthenics skill classification is the computer vision task of inferring the skill performed by an athlete from images, enabling automatic performance assessment and personalized analytics. Traditional methods for calisthenics skill recognition rely on pose estimation to extract skeletal keypoints from images, which are then fed to a classification algorithm to infer the performed skill. Despite progress in human pose estimation, these algorithms still involve high computational costs, long inference times, and complex setups, which limit their applicability in real-time applications or on mobile devices. This work proposes a direct approach to calisthenics skill recognition, which leverages depth estimation and athlete patch retrieval to avoid the computationally expensive human pose estimation module. Using Depth Anything V2 for depth…
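As a rough illustration of what retrieving an athlete patch with the help of a depth map could look like, the Python sketch below picks, among candidate person boxes, the one whose region is nearest to the camera and crops it. Everything in it is an assumption made for illustration: the abstract does not specify the selection rule, the box format `(x0, y0, x1, y1)`, the helper names, or the depth convention (larger values are taken to mean nearer, as in relative inverse-depth maps). It is not the authors' implementation.

```python
from typing import List, Sequence, Tuple

# Hypothetical box format: (x0, y0, x1, y1) with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]
Grid = List[List[float]]


def crop(grid: Grid, box: Box) -> Grid:
    """Cut the rectangular region described by `box` out of a 2D grid."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in grid[y0:y1]]


def mean_value(grid: Grid) -> float:
    """Average of all values in a non-empty 2D grid."""
    values = [v for row in grid for v in row]
    return sum(values) / len(values)


def select_foreground_box(depth: Grid, boxes: Sequence[Box]) -> Box:
    """Pick the box with the highest mean depth value.

    ASSUMPTION: larger depth values mean "closer to the camera", so the
    highest mean marks the foreground person.
    """
    return max(boxes, key=lambda b: mean_value(crop(depth, b)))


# Toy 4x6 depth map: a near subject on the left, a farther one on the right.
depth = [
    [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9, 0.1, 0.3, 0.3],
    [0.1, 0.9, 0.9, 0.1, 0.3, 0.3],
    [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
]
# Candidate person boxes, e.g. from some detector (hypothetical input).
boxes = [(1, 1, 3, 3), (4, 1, 6, 3)]

athlete_box = select_foreground_box(depth, boxes)
athlete_patch = crop(depth, athlete_box)
print(athlete_box)
```

In a pipeline of this kind, the cropped patch (RGB or depth) would then be passed to an image classifier in place of skeletal keypoints.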
Taxonomy
Topics: Industrial Vision Systems and Defect Detection
