Benchmarking Badminton Action Recognition with a New Fine-Grained Dataset
Qi Li, Tzu-Chen Chiu, Hsiang-Wei Huang, Min-Te Sun, Wei-Shinn Ku

TL;DR
This paper introduces the VideoBadminton dataset, a fine-grained video dataset for badminton action recognition, and evaluates leading methods to advance sports-specific action understanding in computer vision.
Contribution
The paper presents a new detailed badminton action dataset and benchmarks existing methods, addressing the need for fine-grained sports action recognition datasets.
Findings
Leading methods show varied performance on the new dataset.
Fine-grained distinctions in badminton actions pose challenges for current models.
The dataset enables targeted research in sports action recognition.
Abstract
In the dynamic and evolving field of computer vision, action recognition has become a key focus, especially with the advent of sophisticated methodologies like Convolutional Neural Networks (CNNs), Convolutional 3D, Transformer, and spatial-temporal feature fusion. These technologies have shown promising results on well-established benchmarks but face unique challenges in real-world applications, particularly in sports analysis, where the precise decomposition of activities and the distinction of subtly different actions are crucial. Existing datasets like UCF101, HMDB51, and Kinetics have offered a diverse range of video data for various scenarios. However, there's an increasing need for fine-grained video datasets that capture detailed categorizations and nuances within broader action categories. In this paper, we introduce the VideoBadminton dataset derived from high-quality…
Taxonomy
Topics: Human Pose and Action Recognition
Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Softmax · Layer Normalization · Multi-Head Attention · Dropout · Residual Connection · Position-Wise Feed-Forward Layer · Byte Pair Encoding
