Towards Automatic Speech to Sign Language Generation

Parul Kapoor; Rudrabha Mukhopadhyay; Sindhu B Hegde; Vinay Namboodiri; C V Jawahar

arXiv:2106.12790 · cs.CV · June 25, 2021

TL;DR

This paper introduces the first system to generate continuous sign language videos directly from speech, utilizing a new dataset and a multi-tasking transformer model to improve naturalness and accuracy.

Contribution

The paper presents a novel end-to-end approach that generates sign language directly from speech, without text as input, supported by a new Indian sign language dataset comprising speech-level annotations, text transcripts, and the corresponding sign language videos.

Findings

01. Effective sign pose sequence generation demonstrated
02. Multi-tasking transformer outperforms baselines
03. Ablation studies highlight key module contributions

Abstract

We aim to solve the highly challenging task of generating continuous sign language videos solely from speech segments for the first time. Recent efforts in this space have focused on generating such videos from human-annotated text transcripts without considering other modalities. However, replacing speech with sign language proves to be a practical solution while communicating with people suffering from hearing loss. Therefore, we eliminate the need of using text as input and design techniques that work for more natural, continuous, freely uttered speech covering an extensive vocabulary. Since the current datasets are inadequate for generating sign language directly from speech, we collect and release the first Indian sign language dataset comprising speech-level annotations, text transcripts, and the corresponding sign-language videos. Next, we propose a multi-tasking transformer…

Code & Models

Repositories

kapoorparul/Towards-Automatic-Speech-to-SL
PyTorch · Official

Taxonomy

Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Hearing Impairment and Communication