AugmentGest: Can Random Data Cropping Augmentation Boost Gesture Recognition Performance?
Nada Aboudeshish, Dmitry Ignatov, Radu Timofte

TL;DR
This paper introduces AugmentGest, a comprehensive data augmentation framework with random cropping and geometric transformations that significantly enhances gesture recognition performance across multiple models and datasets.
Contribution
It presents a novel augmentation pipeline incorporating random cropping and transformations, improving generalization and robustness in skeleton-based and point cloud gesture recognition.
Findings
Achieved state-of-the-art results on DHG14/28 and SHREC'17 datasets.
Enhanced model robustness and diversity of gesture representations.
Demonstrated effectiveness across three architectures (multi-stream e2eET, FPPR, and DD-Network) and multiple datasets.
Abstract
Data augmentation is a crucial technique in deep learning, particularly for tasks with limited dataset diversity, such as skeleton-based datasets. This paper proposes a comprehensive data augmentation framework that integrates geometric transformations (random cropping, rotation, zooming) and intensity-based transformations (brightness and contrast adjustments) to simulate real-world variations. Random cropping ensures the preservation of spatio-temporal integrity while addressing challenges such as viewpoint bias and occlusions. The augmentation pipeline generates three augmented versions of each sample in addition to the original, thus quadrupling the dataset size and enriching the diversity of gesture representations. The proposed augmentation strategy is evaluated on three models: multi-stream e2eET, FPPR point cloud-based hand gesture recognition (HGR), and DD-Network.…
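The "three augmented versions plus the original" scheme described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes each sample is an image-like float array of shape (H, W, C) with values in [0, 1]. The function names, the crop scale range (0.8 to 1.0), the brightness and contrast ranges, and the nearest-neighbour resize are all hypothetical choices made for illustration. Rotation is omitted for brevity.

```python
import numpy as np


def random_crop(img, scale, rng):
    """Crop a random window covering `scale` of each side, then resize it
    back to the original size with nearest-neighbour sampling."""
    h, w = img.shape[:2]
    ch, cw = max(1, int(h * scale)), max(1, int(w * scale))
    top = rng.integers(0, h - ch + 1)    # upper bound is exclusive
    left = rng.integers(0, w - cw + 1)
    crop = img[top:top + ch, left:left + cw]
    rows = np.arange(h) * ch // h        # map output rows to crop rows
    cols = np.arange(w) * cw // w        # map output cols to crop cols
    return crop[rows][:, cols]


def adjust_brightness_contrast(img, brightness, contrast):
    """Scale deviations from the mean (contrast), add an offset
    (brightness), and clip back to [0, 1]."""
    mean = img.mean()
    return np.clip((img - mean) * contrast + mean + brightness, 0.0, 1.0)


def augment_dataset(samples, labels, n_aug=3, seed=0):
    """Return the original samples plus `n_aug` augmented copies of each,
    so n_aug=3 yields a dataset four times the original size."""
    rng = np.random.default_rng(seed)
    out_x, out_y = [], []
    for x, y in zip(samples, labels):
        out_x.append(x)                  # keep the original sample
        out_y.append(y)
        for _ in range(n_aug):
            # Assumed parameter ranges; the paper's values are not given here.
            aug = random_crop(x, scale=rng.uniform(0.8, 1.0), rng=rng)
            aug = adjust_brightness_contrast(
                aug,
                brightness=rng.uniform(-0.1, 0.1),
                contrast=rng.uniform(0.8, 1.2),
            )
            out_x.append(aug)
            out_y.append(y)              # augmentation preserves the label
    return np.stack(out_x), np.array(out_y)
```

For example, calling `augment_dataset` on 5 samples of shape (32, 32, 3) returns an array of shape (20, 32, 32, 3), with each original followed by its three augmented copies. Resizing the crop back to the original size keeps every sample compatible with a fixed-input model and makes the crop act as a mild random zoom.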
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Interactive and Immersive Displays
