MHB: Multimodal Handshape-aware Boundary Detection for Continuous Sign Language Recognition
Mingyu Zhao, Zhanfu Yang, Yang Zhou, Zhaoyang Xia, Can Jin, Xiaoxiao He, Dimitris N. Metaxas

TL;DR
This paper introduces a multimodal approach that combines 3D skeletal features with handshape information to improve boundary detection and recognition in continuous sign language videos, with reported gains over previous methods on the ASLLRP corpus.
Contribution
It presents a novel multimodal fusion framework that integrates 3D skeletal and handshape features for more robust boundary detection in continuous sign language recognition.
Findings
Significant improvement over previous methods on the ASLLRP corpus
Effective use of 3D skeletal features for boundary detection
Successful integration of handshape classification into segmentation pipeline
Abstract
This paper employs a multimodal approach for continuous sign recognition by first using ML to detect the start and end frames of signs in videos of American Sign Language (ASL) sentences, and then recognizing the segmented signs. For improved robustness, we use 3D skeletal features extracted from sign language videos to take into account the convergence of sign properties and their dynamics that tend to cluster at sign boundaries. Another focus of this paper is the incorporation of information from 3D handshape for boundary detection. To detect handshapes normally expected at the beginning and end of signs, we pretrain a handshape classifier to detect 87 linguistically defined canonical handshape categories using a dataset that we created by integrating and normalizing several existing datasets. A multimodal fusion module is then used to unify the pretrained sign video…
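The two-stage design described in the abstract (detect sign boundaries, then recognize each segment) can be illustrated with a minimal post-processing sketch. It assumes a detector has already produced per-frame start and end probabilities. The function name `pair_boundaries`, the threshold value, and the greedy pairing rule are hypothetical choices for illustration only, not the authors' method.

```python
from typing import List, Sequence, Tuple


def pair_boundaries(
    start_prob: Sequence[float],
    end_prob: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[Tuple[int, int]]:
    """Greedily pair thresholded start/end frames into (start, end) segments.

    A segment opens at the first frame whose start probability reaches the
    threshold and closes at the next later frame whose end probability
    reaches it. A segment left open at the end of the video is discarded.
    """
    segments: List[Tuple[int, int]] = []
    current_start = None
    for frame, (s, e) in enumerate(zip(start_prob, end_prob)):
        if current_start is None:
            if s >= threshold:
                current_start = frame
        elif e >= threshold:
            segments.append((current_start, frame))
            current_start = None
    return segments


# Toy per-frame scores for an 8-frame clip (made-up values).
start_scores = [0.1, 0.9, 0.2, 0.1, 0.0, 0.8, 0.1, 0.1]
end_scores = [0.0, 0.1, 0.1, 0.7, 0.0, 0.1, 0.2, 0.9]
print(pair_boundaries(start_scores, end_scores))
```

Each resulting frame range would then be passed to an isolated-sign recognizer. A real system would likely smooth the scores and handle overlapping or missing boundaries more carefully than this greedy rule does.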
Taxonomy
Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Interactive and Immersive Displays
