Natural Language-Assisted Sign Language Recognition
Ronglai Zuo, Fangyun Wei, Brian Mak

TL;DR
This paper introduces NLA-SLR, a framework that leverages semantic information from glosses to improve sign language recognition, addressing visual ambiguities and enhancing model accuracy.
Contribution
It proposes language-aware label smoothing, inter-modality mixup, and a novel video-keypoint backbone to advance sign language recognition performance.
Findings
Achieves state-of-the-art results on MSASL, WLASL, and NMFs-CSL datasets.
Effectively distinguishes visually similar signs using semantic-aware techniques.
Enhances recognition accuracy by integrating video and keypoint features.
Abstract
Sign languages are visual languages which convey information by signers' handshape, facial expression, body movement, and so forth. Due to the inherent restriction of combinations of these visual ingredients, there exist a significant number of visually indistinguishable signs (VISigns) in sign languages, which limits the recognition capacity of vision neural networks. To mitigate the problem, we propose the Natural Language-Assisted Sign Language Recognition (NLA-SLR) framework, which exploits semantic information contained in glosses (sign labels). First, for VISigns with similar semantic meanings, we propose language-aware label smoothing by generating soft labels for each training sign whose smoothing weights are computed from the normalized semantic similarities among the glosses to ease training. Second, for VISigns with distinct semantic meanings, we present an inter-modality…
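As a rough illustration of the language-aware label smoothing described in the abstract, the sketch below builds a soft label for one training sign from the semantic similarities among gloss embeddings. It is not the authors' implementation. The toy two-dimensional embeddings, the function name language_aware_soft_label, the smoothing weight epsilon, the temperature tau, and the choice of a softmax over the non-target glosses as the normalization are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def language_aware_soft_label(target, embeddings, epsilon=0.2, tau=0.5):
    """Return a soft label for class `target` (hypothetical helper).

    The true class keeps 1 - epsilon of the probability mass. The remaining
    epsilon is spread over the other classes in proportion to a softmax of
    their semantic similarity to the target gloss (assumed normalization).
    """
    sims = [cosine(embeddings[target], e) for e in embeddings]
    others = [i for i in range(len(embeddings)) if i != target]

    # Numerically stable softmax over the non-target similarities.
    peak = max(sims[i] / tau for i in others)
    exps = {i: math.exp(sims[i] / tau - peak) for i in others}
    total = sum(exps.values())

    label = [0.0] * len(embeddings)
    for i in others:
        label[i] = epsilon * exps[i] / total
    label[target] = 1.0 - epsilon
    return label


# Toy gloss embeddings (made up for illustration, not real word vectors).
glosses = ["cold", "winter", "pizza"]
embeddings = [[1.0, 0.1], [0.9, 0.3], [-0.2, 1.0]]

# Soft label for a training sign whose gloss is "cold".
soft = language_aware_soft_label(0, embeddings)
```

In this toy setup, "winter" receives more of the smoothing mass than "pizza" because its embedding lies closer to that of "cold", whereas uniform label smoothing would give both the same weight.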
Taxonomy
Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Gait Recognition and Analysis
Methods: Mixup · Label Smoothing
