Word-level Sign Language Recognition with Multi-stream Neural Networks Focusing on Local Regions and Skeletal Information
Mizuki Maruyama, Shrey Singh, Katsufumi Inoue, Partha Pratim Roy, Masakazu Iwamura, Michifumi Yoshioka

TL;DR
This paper introduces a multi-stream neural network for word-level sign language recognition that leverages local regions and skeletal data, significantly improving accuracy over traditional action recognition methods.
Contribution
The paper proposes a novel multi-stream neural network architecture tailored for word-level sign language recognition (WSLR), combining a base stream, a local image stream, and a skeleton stream to improve recognition performance.
Findings
Achieved 10-15% higher Top-1 accuracy than traditional action recognition methods on the WLASL and MS-ASL datasets.
Demonstrated the effectiveness of combining local image, skeleton, and base movement data.
Showed that task-specific models outperform generic action recognition approaches.
Abstract
Word-level sign language recognition (WSLR) has attracted attention because it is expected to overcome the communication barrier between people with speech impairment and those who can hear. In the WSLR problem, a method designed for action recognition has achieved state-of-the-art accuracy. Indeed, it sounds reasonable for an action recognition method to perform well on WSLR because sign language is regarded as an action. However, a careful evaluation of the tasks reveals that the tasks of action recognition and WSLR are inherently different. Hence, in this paper, we propose a novel WSLR method that takes into account information specifically useful for the WSLR problem. We realize it as a multi-stream neural network (MSNN), which consists of three streams: 1) base stream, 2) local image stream, and 3) skeleton stream. Each stream is designed to handle different types of…
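One common way to combine several streams is late fusion of their class scores. The sketch below illustrates that idea for the three streams named in the abstract. It is an assumption for illustration only: the abstract does not say how the MSNN merges its streams, and the function names, equal default weights, and example logits are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_logits, weights=None):
    """Weighted average of per-stream class probabilities.

    stream_logits: dict mapping a stream name to its list of class logits.
    weights: optional dict of per-stream weights; defaults to equal weights.
    NOTE: this fusion rule is an assumption, not the paper's stated method.
    """
    names = list(stream_logits)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    num_classes = len(stream_logits[names[0]])
    fused = [0.0] * num_classes
    for name in names:
        probs = softmax(stream_logits[name])
        for i, p in enumerate(probs):
            fused[i] += weights[name] * p
    return fused


def predict(stream_logits, weights=None):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_streams(stream_logits, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical logits for a 3-word vocabulary, one list per stream.
example_logits = {
    "base": [2.0, 1.0, 0.1],
    "local_image": [0.5, 2.5, 0.2],
    "skeleton": [0.3, 2.0, 0.4],
}
predicted_class = predict(example_logits)
```

In this toy input the base stream alone favors class 0, while the local image and skeleton streams favor class 1, so the fused prediction follows the two streams that agree.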
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Gait Recognition and Analysis
