Multi-Stream Keypoint Attention Network for Sign Language Recognition and Translation

Mo Guan; Yan Wang; Guangkun Ma; Jiarui Liu; Mingzu Sun

arXiv:2405.05672 · cs.CV · May 10, 2024 · 3 cites

TL;DR

This paper introduces MSKA-SLR, a multi-stream keypoint attention network for sign language recognition and translation. It operates on keypoint sequences rather than RGB video, which improves robustness to background changes and reduces computational cost, and it achieves state-of-the-art results on the Phoenix-2014T benchmark.

Contribution

The paper proposes a multi-stream keypoint attention network that models interactions between keypoints for sign language recognition and translation, and reports results that surpass prior methods on the Phoenix-2014T benchmark.

Findings

01 Achieved state-of-the-art performance on the Phoenix-2014T benchmark.

02 Demonstrated robustness against background fluctuations using keypoint-based inputs.

03 Validated effectiveness through extensive experiments on multiple datasets.

Abstract

Sign language serves as a non-vocal means of communication, transmitting information and significance through gestures, facial expressions, and bodily movements. The majority of current approaches for sign language recognition (SLR) and translation rely on RGB video inputs, which are vulnerable to fluctuations in the background. Employing a keypoint-based strategy not only mitigates the effects of background alterations but also substantially diminishes the computational demands of the model. Nevertheless, contemporary keypoint-based methodologies fail to fully harness the implicit knowledge embedded in keypoint sequences. To tackle this challenge, our inspiration is derived from the human cognition mechanism, which discerns sign language by analyzing the interplay between gesture configurations and supplementary elements. We propose a multi-stream keypoint attention network to depict a…
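As a rough illustration of the idea in the abstract, the sketch below groups the keypoints of one frame into separate streams and applies self-attention among the keypoints of each stream. It is a minimal sketch under stated assumptions, not the authors' implementation: the stream names, the keypoint indices in `STREAMS`, and the helper functions are hypothetical, and the raw coordinates stand in for the learned query, key, and value projections a trained model would use.

```python
import math

# Hypothetical keypoint grouping. The stream names and indices are
# illustrative assumptions, not the paper's actual configuration.
STREAMS = {
    "left_hand": [0, 1, 2],
    "right_hand": [3, 4, 5],
    "face": [6, 7],
    "body": [8, 9],
}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def keypoint_self_attention(points):
    """Scaled dot-product self-attention over one frame's keypoints.

    points: list of (x, y) tuples. Queries, keys, and values are the raw
    coordinates here; a trained model would use learned projections.
    """
    dim = len(points[0])
    attended = []
    for q in points:
        # Similarity of this keypoint to every keypoint in the stream.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in points
        ]
        weights = softmax(scores)
        # Weighted average of the stream's keypoints.
        attended.append(
            tuple(
                sum(w * v[d] for w, v in zip(weights, points))
                for d in range(dim)
            )
        )
    return attended


def multi_stream_attention(frame):
    """Apply attention independently within each keypoint stream."""
    return {
        name: keypoint_self_attention([frame[i] for i in idx])
        for name, idx in STREAMS.items()
    }


# One synthetic frame of ten (x, y) keypoints in normalized coordinates.
frame = [(0.1 * i, 0.05 * i) for i in range(10)]
features = multi_stream_attention(frame)
```

Each stream yields one attended feature per keypoint, so `features["face"]` holds two coordinate pairs. Because the inputs are a few floats per keypoint rather than full RGB frames, the background never enters the computation, which is the robustness argument the abstract makes.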

Code & Models

Repositories

sutwangyan/MSKA (PyTorch, official)

Taxonomy

Topics: Hand Gesture Recognition Systems · Gait Recognition and Analysis · Hearing Impairment and Communication