Boosting Gesture Recognition with an Automatic Gesture Annotation Framework
Junxiao Shen, Xuhai Xu, Ran Tan, Amy Karlson, Evan Strasnick

TL;DR
This paper introduces an automatic gesture annotation framework that combines the Connectionist Temporal Classification (CTC) loss with semi-supervised learning to improve gesture recognition accuracy and reduce manual labeling effort.
Contribution
The proposed framework automatically annotates gesture classes and their temporal ranges, and the pseudo labels it produces improve the accuracy of downstream gesture recognition models.
Findings
Gesture classification accuracy improved by 3-4%.
Localization accuracy increased by 71-75%.
Downstream model accuracy boosted by 11-18%.
Abstract
Training a real-time gesture recognition model relies heavily on annotated data. However, manual data annotation is costly and demands substantial human effort. To address this challenge, we propose a framework that automatically annotates gesture classes and identifies their temporal ranges. Our framework consists of two key components: (1) a novel annotation model that leverages the Connectionist Temporal Classification (CTC) loss, and (2) a semi-supervised learning pipeline that enables the model to improve its performance by training on its own predictions, known as pseudo labels. These high-quality pseudo labels can also be used to enhance the accuracy of other downstream gesture recognition models. To evaluate our framework, we conducted experiments using two publicly available gesture datasets. Our ablation study demonstrates that our annotation model design surpasses…
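As a rough illustration of the two components named in the abstract, the sketch below turns per-frame class probabilities into gesture segments with temporal ranges and then keeps only confident segments as pseudo labels. It is not the authors' annotation model: the blank index, the greedy argmax decoding, the mean-confidence score, the 0.8 threshold, and the function names `decode_segments` and `select_pseudo_labels` are all assumptions made for this example. CTC-trained models often emit short, spiky predictions, so a naive decode like this one would likely underestimate gesture boundaries in practice.

```python
from typing import List, Tuple

BLANK = 0  # assumed index of the CTC blank ("no gesture") class

# (class_id, start_frame, end_frame, mean_confidence); hypothetical layout
Segment = Tuple[int, int, int, float]


def _close(seg: list) -> Segment:
    """Convert a running [class, start, end, summed_conf] record to a Segment."""
    cls, start, end, total = seg
    return (cls, start, end, total / (end - start + 1))


def decode_segments(frame_probs: List[List[float]]) -> List[Segment]:
    """Greedy decode: argmax per frame, then merge consecutive frames that
    share the same non-blank class into one segment."""
    segments: List[Segment] = []
    current = None  # running segment, or None while inside blank frames
    for t, probs in enumerate(frame_probs):
        cls = max(range(len(probs)), key=probs.__getitem__)
        if current is not None and cls == current[0]:
            current[2] = t            # extend the open segment
            current[3] += probs[cls]
            continue
        if current is not None:       # class changed: close the open segment
            segments.append(_close(current))
            current = None
        if cls != BLANK:              # start a new segment on a gesture frame
            current = [cls, t, t, probs[cls]]
    if current is not None:
        segments.append(_close(current))
    return segments


def select_pseudo_labels(segments: List[Segment],
                         threshold: float = 0.8) -> List[Segment]:
    """Keep segments whose mean confidence meets the (assumed) threshold."""
    return [s for s in segments if s[3] >= threshold]


# Toy input: 5 frames, 3 classes (index 0 is blank).
frame_probs = [
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.80, 0.10],  # gesture 1
    [0.10, 0.90, 0.00],  # gesture 1
    [0.70, 0.20, 0.10],  # blank
    [0.30, 0.10, 0.60],  # gesture 2, low confidence
]
segments = decode_segments(frame_probs)
pseudo_labels = select_pseudo_labels(segments)
print(segments)
print(pseudo_labels)
```

On the toy input, gesture 1 spans frames 1 to 2 and gesture 2 covers frame 4 only; with the assumed threshold, only the gesture 1 segment survives as a pseudo label. In a semi-supervised loop, segments selected this way would be added to the training set and the model retrained, which is the general self-training idea the abstract describes.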
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Hearing Impairment and Communication
