TL;DR
This paper introduces OMG-Bench, a large-scale benchmark dataset for skeleton-based online micro hand gesture recognition, and proposes HMATr, a hierarchical memory transformer framework that improves detection accuracy.
Contribution
The paper contributes OMG-Bench, which it describes as the first large-scale public benchmark for skeleton-based online micro gesture recognition, and HMATr, a hierarchical memory-augmented transformer for online gesture recognition.
Findings
HMATr outperforms existing methods by 7.6% in detection rate.
OMG-Bench contains 13,948 instances of 40 fine-grained gesture classes across 1,272 sequences.
A multi-view self-supervised pipeline generates skeleton data automatically; heuristic rules and expert refinement make the annotation semi-automatic.
Abstract
Online micro gesture recognition from hand skeletons is critical for VR/AR interaction but faces challenges due to limited public datasets and task-specific algorithms. Micro gestures involve subtle motion patterns, which make constructing datasets with precise skeletons and frame-level annotations difficult. To this end, we develop a multi-view self-supervised pipeline to automatically generate skeleton data, complemented by heuristic rules and expert refinement for semi-automatic annotation. Based on this pipeline, we introduce OMG-Bench, the first large-scale public benchmark for skeleton-based online micro gesture recognition. It features 40 fine-grained gesture classes with 13,948 instances across 1,272 sequences, characterized by subtle motions, rapid dynamics, and continuous execution. To tackle these challenges, we propose Hierarchical Memory-Augmented Transformer (HMATr), an…
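To make the notion of frame-level annotation concrete, here is a minimal sketch of how per-frame labels could be grouped into gesture instances. It is an illustration only: the label encoding, the `BACKGROUND` value, the `GestureInstance` type, and the `frames_to_instances` helper are all assumptions made for this example and do not reflect the actual OMG-Bench file format or the authors' annotation tooling.

```python
from dataclasses import dataclass
from typing import List

BACKGROUND = -1  # hypothetical label for frames that contain no gesture


@dataclass
class GestureInstance:
    label: int  # gesture class id (assumed integer encoding)
    start: int  # first frame index, inclusive
    end: int    # last frame index, inclusive


def frames_to_instances(frame_labels: List[int]) -> List[GestureInstance]:
    """Group runs of identical non-background frame labels into instances."""
    instances: List[GestureInstance] = []
    run_start = 0
    for i in range(1, len(frame_labels) + 1):
        # Close the current run at the end of the sequence or on a label change.
        if i == len(frame_labels) or frame_labels[i] != frame_labels[run_start]:
            if frame_labels[run_start] != BACKGROUND:
                instances.append(
                    GestureInstance(frame_labels[run_start], run_start, i - 1)
                )
            run_start = i
    return instances


# Toy sequence: gestures 7 and 12 are performed back to back with no pause.
labels = [-1, -1, 3, 3, 3, -1, 7, 7, 12, 12, 12, -1]
instances = frames_to_instances(labels)
for inst in instances:
    print(inst)
```

In the toy sequence, gestures 7 and 12 follow each other with no background frame in between, which mirrors the continuous execution the abstract mentions. A run-length grouping like this cannot separate two consecutive instances of the same class, so a real annotation format would likely store explicit start and end frames instead.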
