Egocentric Gesture Recognition for Head-Mounted AR Devices
Tejo Chalasani, Jan Ondrej, Aljosa Smolic

TL;DR
This paper introduces a deep learning framework for egocentric gesture recognition on head-mounted AR devices. It relies on a novel green screen data augmentation method and a new gesture dataset, and reports improved accuracy over existing methods.
Contribution
It presents an end-to-end deep learning approach combining ego-hand encoding and recurrent networks, along with a novel green screen data augmentation technique and a new gesture dataset.
Findings
Achieved higher recognition accuracy compared to state-of-the-art methods.
Developed a novel data augmentation technique using green screen capture.
Published a new dataset of egocentric gestures in controlled and natural environments.
Abstract
Natural interaction with virtual objects in AR/VR environments makes for a smooth user experience. Gestures are a natural extension from real world to augmented space to achieve these interactions. Finding discriminating spatio-temporal features relevant to gestures and hands in ego-view is the primary challenge for recognising egocentric gestures. In this work, we propose a data-driven end-to-end deep learning approach to address the problem of egocentric gesture recognition, which combines an ego-hand encoder network to find ego-hand features, and a recurrent neural network to discern temporally discriminating features. Since deep learning networks are data intensive, we propose a novel data augmentation technique using green screen capture to alleviate the problem of ground truth annotation. In addition, we publish a dataset of 10 gestures performed in a natural fashion in front of a…
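The abstract does not spell out how the green screen augmentation works, so the following is only a minimal sketch of generic chroma-key compositing, not the authors' pipeline. It illustrates why green screen capture can reduce annotation effort: the same threshold that removes the screen also yields a hand mask without manual labelling. The function name `chroma_key_composite`, the `green_margin` threshold, and the "green dominates red and blue" heuristic are all assumptions made for illustration.

```python
import numpy as np


def chroma_key_composite(frame, background, green_margin=40):
    """Replace green-screen pixels in `frame` with pixels from `background`.

    frame, background: uint8 RGB arrays of shape (H, W, 3).
    Returns (composite, hand_mask); hand_mask is True on foreground pixels.
    Hypothetical helper for illustration only.
    """
    rgb = frame.astype(np.int16)  # avoid uint8 wrap-around when subtracting
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Assumed heuristic: a pixel is "screen" when green clearly dominates
    # both the red and the blue channel.
    is_screen = (g - np.maximum(r, b)) > green_margin
    hand_mask = ~is_screen
    # Keep foreground pixels, take everything else from the new background.
    composite = np.where(hand_mask[..., None], frame, background)
    return composite, hand_mask


# Tiny synthetic example: a 4x4 green frame with a 2x2 skin-toned patch.
frame = np.zeros((4, 4, 3), dtype=np.uint8)
frame[...] = (0, 255, 0)
frame[1:3, 1:3] = (200, 150, 120)
background = np.full((4, 4, 3), 90, dtype=np.uint8)

composite, hand_mask = chroma_key_composite(frame, background)
```

Compositing the same captured gesture clip over many different backgrounds would give several training samples from a single recording, each paired with the same mask. Real footage would likely need a more robust key (for example, thresholding in a hue-based colour space and handling green spill), which this sketch leaves out.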
