American Sign Language fingerspelling recognition from video: Methods for unrestricted recognition and signer-independence
Taehwan Kim

TL;DR
This thesis investigates American Sign Language fingerspelling recognition from video in both signer-dependent and signer-independent settings. Its best models, segmental CRFs with deep neural network-based features, reach letter error rates of about 8% (signer-dependent) and 17% (signer-independent, with neural network adaptation).
Contribution
It proposes segmental (semi-Markov) CRF recognizers that use deep neural network-based features, and explores neural network adaptation to handle variation between signers.
Findings
Signer-dependent letter error rate: about 8% at best
Signer-independent letter error rate: 17% at best, with neural network adaptation
Neural network adaptation improves signer-independent recognition
Abstract
In this thesis, we study the problem of recognizing video sequences of fingerspelled letters in American Sign Language (ASL). Fingerspelling comprises a significant but relatively understudied part of ASL, and recognizing it is challenging for a number of reasons: It involves quick, small motions that are often highly coarticulated; it exhibits significant variation between signers; and there has been a dearth of continuous fingerspelling data collected. In this work, we propose several types of recognition approaches, and explore the signer variation problem. Our best-performing models are segmental (semi-Markov) conditional random fields using deep neural network-based features. In the signer-dependent setting, our recognizers achieve up to about 8% letter error rates. The signer-independent setting is much more challenging, but with neural network adaptation we achieve up to 17%…
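The results above are reported as letter error rates. The abstract does not spell out how the metric is computed, so the following is only a minimal sketch under a common assumption: the error rate is the Levenshtein (edit) distance between the hypothesized and reference letter sequences, summed over all test sequences and divided by the total number of reference letters. The function names `edit_distance` and `letter_error_rate` are illustrative and do not come from the thesis.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance with unit cost for substitution, insertion, deletion."""
    # prev[j] holds the distance between the first i-1 items of ref
    # and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference letters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def letter_error_rate(references, hypotheses):
    """Assumed definition: total edit errors / total reference letters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_letters = sum(len(r) for r in references)
    if total_letters == 0:
        raise ValueError("references contain no letters")
    total_errors = sum(edit_distance(r, h)
                       for r, h in zip(references, hypotheses))
    return total_errors / total_letters
```

Under this assumed definition, a recognizer that outputs "abd" for the fingerspelled word "abcd" makes one deletion error over four reference letters, a letter error rate of 25%.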
Taxonomy
Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Gait Recognition and Analysis
