Real-time on-device nod and shake recognition
Elmar H. Langholz, Reuben Brasher

TL;DR
This paper presents a method for real-time, on-device recognition of head gestures such as nods and shakes from iPhone X depth camera data, training deep learning models on augmented Euler angle sequences so that they remain robust with limited data.
Contribution
It introduces a novel approach to train gesture recognition models with limited data, using data augmentation and deep neural networks like LSTM and GRU, optimized for on-device deployment.
Findings
Larger models such as LSTM and GRU, trained on augmented Euler angle sequences, outperform the results reported for smaller models such as HMM.
Deep neural networks achieve real-time recognition on iPhone X.
The authors argue that the method could be adapted to other non-verbal human gestures.
Abstract
We discuss methods for teaching systems to identify gestures such as head nods and shakes. We use the iPhone X depth camera to gather data and later use similar data as input for a working app. These methods have proved robust for training with limited datasets, and we therefore argue that similar methods could be adapted to learn other human-to-human non-verbal gestures. We show how to augment Euler angle gesture sequences to train models with a relatively large number of parameters, such as LSTM and GRU, and obtain better performance than reported for smaller models such as HMM. In the examples here, we demonstrate how to train such models with Keras and run the resulting models in real time on device with CoreML.
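The abstract does not say which transformations are used to augment the Euler angle sequences, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes each gesture is a list of (pitch, yaw, roll) frames in radians and that amplitude scaling, a small time shift, and Gaussian jitter are reasonable perturbations. The function names, parameter ranges, and the synthetic nod are all hypothetical.

```python
import math
import random


def synthetic_nod(n_frames=30, amplitude=0.3, cycles=2.0):
    """Hypothetical stand-in for a recorded nod: pitch oscillates, yaw and roll stay at zero."""
    return [
        (amplitude * math.sin(2 * math.pi * cycles * t / n_frames), 0.0, 0.0)
        for t in range(n_frames)
    ]


def augment(sequence, rng, scale_range=(0.8, 1.2), max_shift=3, noise_std=0.01):
    """Return one perturbed copy of a (pitch, yaw, roll) sequence.

    The three perturbations are assumptions made for illustration:
    a random amplitude scale, a random shift in time, and per-angle jitter.
    """
    scale = rng.uniform(*scale_range)
    shift = rng.randint(-max_shift, max_shift)
    n = len(sequence)
    augmented = []
    for t in range(n):
        # Clamp the source index so the copy keeps the original length;
        # frames at the edges are repeated when the shift runs past them.
        src = min(max(t + shift, 0), n - 1)
        augmented.append(
            tuple(scale * angle + rng.gauss(0.0, noise_std) for angle in sequence[src])
        )
    return augmented


def augment_dataset(sequences, labels, copies, seed=0):
    """Keep every original sequence and add `copies` perturbed versions of each."""
    rng = random.Random(seed)
    out_sequences, out_labels = [], []
    for sequence, label in zip(sequences, labels):
        out_sequences.append(sequence)
        out_labels.append(label)
        for _ in range(copies):
            out_sequences.append(augment(sequence, rng))
            out_labels.append(label)
    return out_sequences, out_labels
```

Under these assumptions, calling `augment_dataset([synthetic_nod()], ["nod"], copies=4)` turns one recording into five labeled sequences of equal length, the kind of enlarged training set that a recurrent model with many parameters would need when few real recordings are available.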
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Multimodal Machine Learning Applications
