CNN+RNN Depth and Skeleton based Dynamic Hand Gesture Recognition
Kenneth Lai, Svetlana N. Yanushkevich

TL;DR
This paper presents a combined CNN and RNN approach for dynamic hand gesture recognition using depth and skeleton data, reaching 85.46% accuracy and comparing data fusion techniques.
Contribution
It introduces a novel fusion of CNN and RNN for improved gesture recognition from depth and skeleton data, with comprehensive analysis of fusion methods.
Findings
Achieved 85.46% accuracy on the gesture dataset.
Demonstrated the effectiveness of combining depth and skeleton data.
Compared various data fusion techniques for optimal recognition.
Abstract
Human activity and gesture recognition is an important component of the rapidly growing domain of ambient intelligence, in particular in assisted living and smart homes. In this paper, we propose to combine the power of two deep learning techniques, convolutional neural networks (CNN) and recurrent neural networks (RNN), for automated hand gesture recognition using both depth and skeleton data. Each of these types of data can be used separately to train neural networks to recognize hand gestures. While RNNs were previously reported to perform well in recognizing sequences of movement for each skeleton joint given the skeleton information only, this study aims to utilize depth data and apply a CNN to extract important spatial information from the depth images. Together, the tandem CNN+RNN is capable of recognizing a sequence of gestures more accurately. As well, various types of…
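The abstract is cut off before it describes the fusion schemes that were compared, so the following is only a minimal sketch of one generic option, score-level (late) fusion, and not the authors' method. It assumes a depth branch (CNN) and a skeleton branch (RNN) that each emit one raw score per gesture class. The function names, the `depth_weight` parameter, and the example scores are all hypothetical.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(depth_scores, skeleton_scores, depth_weight=0.5):
    """Weighted average of per-branch class probabilities.

    Hypothetical late-fusion rule: depth_weight = 1.0 trusts only the
    depth (CNN) branch, 0.0 trusts only the skeleton (RNN) branch.
    """
    if len(depth_scores) != len(skeleton_scores):
        raise ValueError("both branches must score the same set of classes")
    if not 0.0 <= depth_weight <= 1.0:
        raise ValueError("depth_weight must lie in [0, 1]")
    p_depth = softmax(depth_scores)
    p_skeleton = softmax(skeleton_scores)
    return [
        depth_weight * d + (1.0 - depth_weight) * s
        for d, s in zip(p_depth, p_skeleton)
    ]


def predict(depth_scores, skeleton_scores, depth_weight=0.5):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_scores(depth_scores, skeleton_scores, depth_weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up scores for three gesture classes (illustration only).
depth_scores = [2.0, 1.0, 0.1]      # depth branch favors class 0
skeleton_scores = [0.5, 2.5, 0.3]   # skeleton branch favors class 1

fused = fuse_scores(depth_scores, skeleton_scores)
label = predict(depth_scores, skeleton_scores)
```

With equal weights, the more confident skeleton branch decides the fused label in this toy example. Setting `depth_weight=1.0` or `0.0` recovers the single-modality predictions, which gives a simple baseline for judging whether fusion helps.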
