TL;DR
CoFInAl introduces a hierarchical coarse-to-fine classification approach for Action Quality Assessment, improving interpretability and achieving state-of-the-art results on long-term datasets by better capturing subtle action cues.
Contribution
The paper proposes CoFInAl, a novel hierarchical method that aligns AQA with pre-trained tasks using coarse-to-fine classification, enhancing performance and interpretability.
Findings
Achieves state-of-the-art correlation scores on Rhythmic Gymnastics and Fis-V datasets.
Improves correlation by 5.49% and 3.55% on Rhythmic Gymnastics and Fis-V, respectively.
Demonstrates the effectiveness of hierarchical coarse-to-fine alignment.
Abstract
Action Quality Assessment (AQA) is pivotal for quantifying actions across domains like sports and medical care. Existing methods often rely on pre-trained backbones from large-scale action recognition datasets to boost performance on smaller AQA datasets. However, this common strategy yields suboptimal results due to the inherent struggle of these backbones to capture the subtle cues essential for AQA. Moreover, fine-tuning on smaller datasets risks overfitting. To address these issues, we propose Coarse-to-Fine Instruction Alignment (CoFInAl). Inspired by recent advances in large language model tuning, CoFInAl aligns AQA with broader pre-trained tasks by reformulating it as a coarse-to-fine classification task. Initially, it learns grade prototypes for coarse assessment and then utilizes fixed sub-grade prototypes for fine-grained assessment. This hierarchical approach mirrors the…
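As a rough illustration of the coarse-to-fine idea described in the abstract, the sketch below first matches a feature vector to the nearest grade prototype and then to the nearest sub-grade prototype. It is not the authors' implementation: the function names, the use of cosine similarity, the reuse of one feature for both stages, and the mapping from a (grade, sub-grade) pair to a score in [0, 1] are all assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(feature, prototypes):
    """Index of the prototype most similar to the feature."""
    sims = [cosine(feature, p) for p in prototypes]
    return max(range(len(sims)), key=sims.__getitem__)


def coarse_to_fine_score(feature, grade_protos, subgrade_protos):
    """Hypothetical two-stage assessment (assumed, not from the paper).

    Stage 1 picks a coarse grade, stage 2 picks a sub-grade within it,
    and the pair is mapped to the midpoint of its score bin in [0, 1].
    """
    g = nearest(feature, grade_protos)      # coarse grade index
    s = nearest(feature, subgrade_protos)   # fine sub-grade index
    n_sub = len(subgrade_protos)
    n_bins = len(grade_protos) * n_sub
    score = (g * n_sub + s + 0.5) / n_bins
    return g, s, score


# Toy prototypes: 2 grades x 2 sub-grades in a 2-D feature space.
grade_protos = [[1.0, 0.0], [0.0, 1.0]]
subgrade_protos = [[1.0, 1.0], [1.0, -1.0]]
result = coarse_to_fine_score([0.2, 0.9], grade_protos, subgrade_protos)
```

With these toy prototypes the feature lands in grade 1, sub-grade 0, which this sketch maps to the midpoint of the third of four score bins.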
