Sign Language Recognition via Skeleton-Aware Multi-Model Ensemble

Songyao Jiang; Bin Sun; Lichen Wang; Yue Bai; Kunpeng Li; Yun Fu

arXiv:2110.06161·cs.CV·October 13, 2021·31 cites

TL;DR

This paper introduces a multi-modal framework that combines skeleton, RGB, and depth data for sign language recognition and reports state-of-the-art accuracy on three datasets.

Contribution

It proposes a new Skeleton Aware Multi-modal Framework with a Global Ensemble Model that effectively fuses features from different modalities for improved sign language recognition.
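This page does not describe how the Global Ensemble Model combines the modalities, so the sketch below only illustrates the general idea of multi-modal fusion, using a simple weighted late fusion of per-modality class scores. It is not the paper's method: the function names, the weights, and the example logits are all hypothetical, and fixed weights stand in for whatever fusion the authors actually use.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(modality_logits, weights):
    """Weighted late fusion: sum of weight * softmax(logits) over modalities.

    modality_logits: dict mapping a modality name to a list of class logits.
    weights: dict mapping the same modality names to scalar weights
             (hypothetical values; the source does not specify any).
    """
    if set(modality_logits) != set(weights):
        raise ValueError("every modality needs exactly one weight")
    num_classes = len(next(iter(modality_logits.values())))
    fused = [0.0] * num_classes
    for name, logits in modality_logits.items():
        if len(logits) != num_classes:
            raise ValueError(f"modality {name!r} has a different class count")
        for i, prob in enumerate(softmax(logits)):
            fused[i] += weights[name] * prob
    return fused


def predict(modality_logits, weights):
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(modality_logits, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up logits for a 3-class toy problem, one list per modality.
example_logits = {
    "skeleton": [2.0, 0.5, 0.1],
    "rgb": [0.2, 1.5, 0.3],
    "depth": [1.0, 0.9, 0.2],
}
# Made-up weights; in practice these might be tuned on a validation set.
example_weights = {"skeleton": 1.0, "rgb": 0.5, "depth": 0.5}
```

Calling `predict(example_logits, example_weights)` returns the index of the class favoured by the weighted combination. Setting a modality's weight to zero removes its influence, which gives a quick way to check how much each stream contributes.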

Findings

01. Achieves state-of-the-art performance on three datasets.

02. Effectively fuses skeleton, RGB, and depth modalities.

03. Significantly outperforms existing methods.

Abstract

Sign language is commonly used by deaf or mute people to communicate but requires extensive effort to master. It is usually performed with the fast yet delicate movement of hand gestures, body posture, and even facial expressions. Current Sign Language Recognition (SLR) methods usually extract features via deep neural networks and suffer from overfitting due to limited and noisy data. Recently, skeleton-based action recognition has attracted increasing attention due to its subject-invariant and background-invariant nature, whereas skeleton-based SLR is still under exploration due to the lack of hand annotations. Some researchers have tried to use off-line hand pose trackers to obtain hand keypoints and aid in recognizing sign language via recurrent neural networks. Nevertheless, none of them outperforms RGB-based approaches yet. To this end, we propose a novel Skeleton Aware Multi-modal…

Taxonomy

Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Gait Recognition and Analysis

Methods: Attentive Walk-Aggregating Graph Neural Network · Surrogate Lagrangian Relaxation · Convolution