Automatic dense annotation of large-vocabulary sign language videos

Liliane Momeni; Hannah Bull; K R Prajwal; Samuel Albanie; Gül Varol; Andrew Zisserman

arXiv:2208.02802 · cs.CV · August 5, 2022 · 1 citation

Open Access

TL;DR

This paper introduces a scalable framework that increases the density of automatic sign annotations in sign language interpreted videos by combining subtitle-signing alignment, synonyms, pseudo-labeling, and exemplar-based methods, raising the number of annotations on the BOBSL corpus from 670K to 5M.

Contribution

It presents methods for dense automatic annotation of sign language videos that improve on previous approaches through synonyms and subtitle-signing alignment, pseudo-labeling from a sign recognition model, and exemplar-based techniques.

Findings

01. Increased the number of automatic annotations on the BOBSL corpus from 670K to 5M

02. Improved annotation accuracy using subtitle-signing alignment and synonyms

03. Made the annotations publicly available to support research

Abstract

Recently, sign language researchers have turned to sign language interpreted TV broadcasts, comprising (i) a video of continuous signing and (ii) subtitles corresponding to the audio content, as a readily available and large-scale source of training data. One key challenge in the usability of such data is the lack of sign annotations. Previous work exploiting such weakly-aligned data only found sparse correspondences between keywords in the subtitle and individual signs. In this work, we propose a simple, scalable framework to vastly increase the density of automatic annotations. Our contributions are the following: (1) we significantly improve previous annotation methods by making use of synonyms and subtitle-signing alignment; (2) we show the value of pseudo-labelling from a sign recognition model as a way of sign spotting; (3) we propose a novel approach for increasing our…
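As a rough illustration of how pseudo-labelling can act as sign spotting, the sketch below keeps a recognition model's top prediction for a video window only when it is confident and matches a word in the aligned subtitle or one of that word's synonyms. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `spot_signs`, the per-window probability dictionaries, the synonym dictionary, and the `threshold` value are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def spot_signs(
    window_probs: List[Dict[str, float]],
    subtitle_words: Set[str],
    synonyms: Dict[str, Set[str]],
    threshold: float = 0.5,
) -> List[Tuple[int, str, float]]:
    """Return (window_index, sign, confidence) pseudo-labels.

    Hypothetical inputs: window_probs holds one {sign: probability} dict per
    temporal window, as a sign recognition model might produce.
    """
    # Expand the subtitle vocabulary with synonyms (assumed lookup table).
    allowed = set(subtitle_words)
    for word in subtitle_words:
        allowed |= synonyms.get(word, set())

    annotations = []
    for index, probs in enumerate(window_probs):
        if not probs:
            continue  # no prediction available for this window
        # Take the most probable sign for the window.
        sign, confidence = max(probs.items(), key=lambda item: item[1])
        # Keep it only if confident and consistent with the subtitle.
        if confidence >= threshold and sign in allowed:
            annotations.append((index, sign, confidence))
    return annotations


# Toy example with made-up probabilities.
example_probs = [
    {"house": 0.9, "tree": 0.1},
    {"home": 0.8, "car": 0.2},
    {"tree": 0.4, "car": 0.6},
]
example = spot_signs(example_probs, {"house", "big"}, {"house": {"home"}}, threshold=0.7)
# example == [(0, "house", 0.9), (1, "home", 0.8)]
```

Restricting predictions to subtitle words and their synonyms is one plausible way to trade recall for precision when the subtitles are only weakly aligned with the signing; the threshold would need tuning on held-out annotations.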

Taxonomy

Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Human Pose and Action Recognition