Action Quality Assessment using Transformers
Abhay Iyer, Mohammad Alali, Hemanth Bodala, Sunit Vaidya

TL;DR
This paper explores the use of Transformer models for action quality assessment in diving videos, showing that they can capture long-range dependencies and reach a competitive Spearman correlation score of 0.9317, which makes them a promising alternative to convolutional methods.
Contribution
The paper introduces Transformer-based architectures for AQA, showing that they capture long-range dependencies effectively and achieve results competitive with conventional convolutional approaches.
Findings
Achieved a Spearman correlation score of 0.9317.
Demonstrated the importance of hyperparameter tuning.
Paved the way for Transformer use in AQA tasks.
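The Spearman correlation reported above measures how well the ordering of predicted scores matches the ordering of the reference scores. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function names and the example scores are hypothetical and chosen only for illustration.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(predicted, reference):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(predicted), average_ranks(reference)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical model predictions and judge scores for five dives.
predicted_scores = [82.1, 45.0, 67.3, 90.4, 55.2]
judge_scores = [80.0, 50.5, 60.0, 95.0, 62.0]
rho = spearman(predicted_scores, judge_scores)
```

Because only the ranks matter, a model can score well on this metric even if its absolute scores are offset or rescaled relative to the judges' scores, as long as it orders the performances correctly.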
Abstract
Action quality assessment (AQA) is an active research problem in video-based applications, and it is challenging because of the score variance per frame. Existing methods address this problem with convolution-based approaches but are limited in their ability to capture long-range dependencies. With the recent advancements in Transformers, we show that they are a suitable alternative to conventional convolution-based architectures. Specifically, can Transformer-based models solve the task of AQA by effectively capturing long-range dependencies, parallelizing computation, and providing a wider receptive field for diving videos? To demonstrate the effectiveness of our proposed architectures, we conducted comprehensive experiments and achieved a competitive Spearman correlation score of 0.9317. Additionally, we explore the effect of hyperparameters on the model's performance…
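To illustrate what "capturing long-range dependencies" means in practice, here is a minimal sketch of single-head scaled dot-product self-attention over a short sequence of per-frame feature vectors. It is not the architecture proposed in the paper: it assumes identity query, key, and value projections, and the toy features are hypothetical. The point is only that the attention weight between two frames depends on their similarity, not on how far apart they are in the sequence.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    frames: list of equal-length feature vectors, one per video frame.
    Returns (outputs, weights), where weights[i][j] is how much frame i
    attends to frame j.
    """
    d = len(frames[0])
    outputs, weights = [], []
    for query in frames:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(d)
            for key in frames
        ]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * frames[j][c] for j in range(len(frames))) for c in range(d)]
        )
    return outputs, weights


# Hypothetical 2-D features: the first and last frames look alike.
toy_frames = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(toy_frames)
```

In this toy sequence the first frame attends to the last frame as strongly as to itself, even though they sit at opposite ends, whereas a convolution with a small kernel would need several stacked layers before those two positions could interact.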
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Stroke Rehabilitation and Recovery
