Sign Language Recognition via Skeleton-Aware Multi-Model Ensemble
Songyao Jiang, Bin Sun, Lichen Wang, Yue Bai, Kunpeng Li, Yun Fu

TL;DR
This paper introduces a novel multi-modal framework combining skeleton, RGB, and depth data for sign language recognition, achieving state-of-the-art accuracy on three datasets.
Contribution
It proposes a new Skeleton Aware Multi-modal Framework with a Global Ensemble Model that effectively fuses features from different modalities for improved sign language recognition.
Findings
Achieves state-of-the-art performance on three datasets.
Effectively fuses skeleton, RGB, and depth modalities.
Significantly outperforms existing methods.
Abstract
Sign language is commonly used by deaf or mute people to communicate but requires extensive effort to master. It is usually performed with the fast yet delicate movement of hand gestures, body posture, and even facial expressions. Current Sign Language Recognition (SLR) methods usually extract features via deep neural networks and suffer overfitting due to limited and noisy data. Recently, skeleton-based action recognition has attracted increasing attention due to its subject-invariant and background-invariant nature, whereas skeleton-based SLR is still under exploration due to the lack of hand annotations. Some researchers have tried to use off-line hand pose trackers to obtain hand keypoints and aid in recognizing sign language via recurrent neural networks. Nevertheless, none of them outperforms RGB-based approaches yet. To this end, we propose a novel Skeleton Aware Multi-modal…
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Gait Recognition and Analysis
Methods: Attentive Walk-Aggregating Graph Neural Network · Surrogate Lagrangian Relaxation · Convolution
