Bootstrapping Sign Language Annotations with Sign Language Models

Colin Lea; Vasileios Baltatzis; Connor Gillis; Raja Kushalnagar; Lorna Quandt; Leah Findlater

arXiv:2604.07606·cs.CV·April 10, 2026

TL;DR

This paper introduces a pseudo-annotation pipeline that combines sign language recognizers with a large language model to generate annotations for sign language videos automatically, so that partially annotated datasets become more usable for model training.

Contribution

It presents a pipeline that combines recognizers and LLMs for pseudo-annotation, establishes baseline fingerspelling and isolated sign recognition models, and releases a newly annotated dataset.

Findings

01

Achieved state-of-the-art 6.7% CER on FSBoard.

02

Achieved 74% top-1 accuracy on the ASL Citizen dataset.

03

Annotated nearly 500 videos with sequence-level gloss labels.
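
For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how character error rate (CER) and top-1 accuracy are commonly computed. It is not the authors' evaluation code. The function names are illustrative, and the choice to pool edits over the whole corpus (rather than average per-sample rates) is an assumption; the paper may use a different convention.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level CER: total character edits / total reference characters.

    Assumption: edits are pooled over all samples before dividing.
    """
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars


def top1_accuracy(labels: Sequence[str], preds: Sequence[str]) -> float:
    """Fraction of samples whose highest-scoring prediction equals the label."""
    if not labels:
        raise ValueError("no labels given")
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    return correct / len(labels)
```

Under this definition, a 6.7% CER means roughly 6.7 character edits are needed per 100 reference characters to turn the predicted fingerspelling into the reference text.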

Abstract

AI-driven sign language interpretation is limited by a lack of high-quality annotated data. New datasets including ASL STEM Wiki and FLEURS-ASL contain professional interpreters and hundreds of hours of data but remain only partially annotated and thus underutilized, in part due to the prohibitive costs of annotating at this scale. In this work, we develop a pseudo-annotation pipeline that takes signed video and English as input and outputs a ranked set of likely annotations, including time intervals, for glosses, fingerspelled words, and sign classifiers. Our pipeline uses sparse predictions from our fingerspelling recognizer and isolated sign recognizer (ISR), along with a K-Shot LLM approach, to estimate these annotations. In service of this pipeline, we establish simple yet effective baseline fingerspelling and ISR models, achieving state-of-the-art on FSBoard (6.7% CER) and on ASL…
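
To make the phrase "a ranked set of likely annotations, including time intervals" more concrete, the sketch below shows one way such output could be represented as data. Everything here is hypothetical: the `Candidate` type, its field names, the `score` field, and `rank_candidates` are illustrative assumptions and are not taken from the paper or its released code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """One hypothetical pseudo-annotation for a span of signed video."""
    kind: str       # assumed categories: "gloss", "fingerspelling", "classifier"
    label: str      # e.g. a gloss or a fingerspelled word
    start_s: float  # interval start, in seconds
    end_s: float    # interval end, in seconds
    score: float    # assumed confidence; higher means more likely


def rank_candidates(cands: List[Candidate], top_k: int = 3) -> List[Candidate]:
    """Return the top_k candidates, highest score first."""
    return sorted(cands, key=lambda c: c.score, reverse=True)[:top_k]


# Made-up values for illustration only, not results from the paper.
candidates = [
    Candidate("gloss", "CELL", 1.2, 1.8, 0.61),
    Candidate("fingerspelling", "DNA", 2.0, 2.9, 0.87),
    Candidate("gloss", "INSIDE", 3.1, 3.5, 0.42),
]
best = rank_candidates(candidates, top_k=2)
```

A human annotator could then review the top-ranked candidates for each interval instead of labeling the video from scratch.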
