Toward American Sign Language Processing in the Real World: Data, Tasks, and Methods
Bowen Shi

TL;DR
This paper advances real-world American Sign Language processing by introducing new datasets, tasks, and methods for fingerspelling recognition, detection, and translation in natural settings, addressing key challenges in sign language understanding.
Contribution
It presents three large-scale in-the-wild ASL datasets, novel end-to-end recognition and detection models, and a benchmark for sign language translation in realistic environments.
Findings
A Conformer-based model achieves near-human performance in fingerspelling recognition.
New datasets enable robust sign language research in natural settings.
Proposed methods improve detection and search of fingerspelling segments.
Abstract
Sign language, which conveys meaning through gestures, is the chief means of communication among deaf people. Recognizing sign language in natural settings presents significant challenges due to factors such as lighting, background clutter, and variations in signer characteristics. In this thesis, I study automatic sign language processing in the wild, using signing videos collected from the Internet. This thesis contributes new datasets, tasks, and methods. Most chapters of this thesis address tasks related to fingerspelling, an important component of sign language that has not been studied widely in prior work. I present three new large-scale ASL datasets in the wild: ChicagoFSWild, ChicagoFSWild+, and OpenASL. Using ChicagoFSWild and ChicagoFSWild+, I address fingerspelling recognition, which consists of transcribing fingerspelling sequences into text. I propose an end-to-end…
Taxonomy
Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication
