One-Shot-Learning Gesture Recognition using HOG-HOF Features
Jakub Konečný, Michal Hagara

TL;DR
This paper introduces a one-shot-learning gesture recognition system that combines HOG and HOF features computed on RGB and depth data, compares histograms with the Quadratic-Chi distance family, and uses a new video trimming algorithm to improve accuracy.
Contribution
It presents a new approach combining appearance and motion descriptors with a novel video trimming algorithm for improved gesture recognition.
Findings
Both proposed methods outperform other published methods on this task
Significant narrowing of the gap between human and algorithm performance
Effective use of cross-bin histogram relationships
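
To make the cross-bin idea concrete, here is a minimal sketch of a Quadratic-Chi style distance between two histograms, written from the commonly cited definition of that distance family rather than from the authors' released code. The function names, the `reach` parameter, and the linear-falloff similarity matrix are illustrative assumptions; the paper's actual similarity matrix and normalisation exponent may differ.

```python
import math

def neighbour_similarity(n, reach=2):
    """Hypothetical bin-similarity matrix: 1 on the diagonal, falling off
    linearly to 0 for bins that are `reach` or more positions apart."""
    return [[max(0.0, 1.0 - abs(i - j) / reach) for j in range(n)]
            for i in range(n)]

def quadratic_chi(p, q, A, m=0.5):
    """Quadratic-Chi style distance between non-negative histograms p and q.

    A[i][j] is the assumed similarity between bins i and j (non-negative,
    1 on the diagonal); m in [0, 1) is the normalisation exponent.
    """
    n = len(p)
    # Normaliser for bin i: (sum_c (p_c + q_c) * A[c][i]) ** m
    denom = [sum((p[c] + q[c]) * A[c][i] for c in range(n)) ** m
             for i in range(n)]
    # Normalised bin differences, using the convention 0 / 0 = 0.
    z = [(p[i] - q[i]) / denom[i] if denom[i] > 0 else 0.0 for i in range(n)]
    # Quadratic form: differences in similar bins partly cancel each other.
    total = sum(z[i] * z[j] * A[i][j] for i in range(n) for j in range(n))
    return math.sqrt(max(total, 0.0))

identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
p, q = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
bin_to_bin = quadratic_chi(p, q, identity)                 # no cross-bin terms
cross_bin = quadratic_chi(p, q, neighbour_similarity(3))   # neighbours related
```

With the identity matrix the measure reduces to a bin-to-bin chi-square style distance, while the neighbour-aware matrix gives a smaller distance when mass has only shifted to an adjacent bin, which is the behaviour one would want for slightly rotated gradient or flow orientations.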
Abstract
The purpose of this paper is to describe one-shot-learning gesture recognition systems developed on the ChaLearn Gesture Dataset. We use RGB and depth images and combine appearance descriptors (Histograms of Oriented Gradients, HOG) with motion descriptors (Histograms of Optical Flow, HOF) for parallel temporal segmentation and recognition. The Quadratic-Chi distance family is used to measure differences between histograms so that cross-bin relationships are captured. We also propose a new algorithm for trimming videos, which removes unimportant frames. We present two methods that use a combination of HOG-HOF descriptors together with variants of the Dynamic Time Warping technique. Both methods outperform other published methods and help narrow the gap between human performance and algorithms on this task. The code has been made publicly available in the MLOSS repository.
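
As a rough illustration of how Dynamic Time Warping supports one-shot recognition, the sketch below aligns a query sequence of per-frame descriptors against a single training example per gesture and picks the label with the lowest alignment cost. This is textbook DTW, not the specific variants used in the paper, and it does not cover temporal segmentation or video trimming. The names `dtw_cost` and `classify_one_shot`, the scalar toy "descriptors", and the absolute-difference frame distance are all assumptions made for the example; in practice each frame would be a HOG-HOF histogram compared with a histogram distance.

```python
def dtw_cost(seq_a, seq_b, dist):
    """Classic DTW: minimum cumulative cost of aligning seq_a with seq_b."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # D[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of: repeat a frame of seq_b,
            # repeat a frame of seq_a, or advance both.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def classify_one_shot(query, templates, dist):
    """Return the label whose single training example aligns most cheaply."""
    return min(templates, key=lambda label: dtw_cost(query, templates[label], dist))

def frame_dist(a, b):
    # Toy stand-in for a histogram distance between two frame descriptors.
    return abs(a - b)

# One training example per gesture (hypothetical toy data).
templates = {"wave": [0, 1, 0, 1], "push": [0, 2, 4]}
query = [0, 1, 1, 0, 1]  # a "wave" performed slightly more slowly
predicted = classify_one_shot(query, templates, frame_dist)
```

Because the alignment may repeat frames on either side, a gesture performed faster or slower than its single training example can still match it closely, which is why DTW is a natural fit when only one example per class is available.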
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Gait Recognition and Analysis
