Everybody Sign Now: Translating Spoken Language to Photo Realistic Sign Language Video
Ben Saunders, Necati Cihan Camgoz, Richard Bowden

TL;DR
This paper introduces SignGAN, a neural model that translates spoken language into photo-realistic sign language videos. It aims to make generated signing more realistic and more understandable to Deaf viewers by combining a transformer with a Mixture Density Network (MDN) and a pose-conditioned human synthesis model.
Contribution
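SignGAN is presented as the first model to generate photo-realistic continuous sign language videos directly from spoken language. A transformer architecture with a Mixture Density Network (MDN) formulation translates spoken language into a skeletal pose sequence, and a pose-conditioned human synthesis model then renders that sequence as video.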
Findings
SignGAN outperforms baseline methods in quantitative metrics.
Human perceptual studies favor SignGAN's realism.
The model enables controllable signer appearance.
Abstract
To be truly understandable and accepted by Deaf communities, an automatic Sign Language Production (SLP) system must generate a photo-realistic signer. Prior approaches based on graphical avatars have proven unpopular, whereas recent neural SLP works that produce skeleton pose sequences have been shown to be not understandable to Deaf viewers. In this paper, we propose SignGAN, the first SLP model to produce photo-realistic continuous sign language videos directly from spoken language. We employ a transformer architecture with a Mixture Density Network (MDN) formulation to handle the translation from spoken language to skeletal pose. A pose-conditioned human synthesis model is then introduced to generate a photo-realistic sign language video from the skeletal pose sequence. This allows the photo-realistic production of sign videos directly translated from written text. We further…
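The abstract names a Mixture Density Network (MDN) formulation but does not give its details here. As a rough illustration of the general idea only, the sketch below computes the negative log-likelihood of one target pose vector under a mixture of isotropic Gaussians, which is the usual MDN training objective. The function name `mdn_nll`, the isotropic covariance, and the input layout are assumptions made for this example, not the authors' implementation.

```python
import math


def mdn_nll(weight_logits, means, log_sigmas, target):
    """Negative log-likelihood of `target` under a Gaussian mixture.

    Hypothetical helper for illustration. Assumes K isotropic components:
    weight_logits: K unnormalised mixture weights
    means:         K mean vectors, each the same length as `target`
    log_sigmas:    K log standard deviations (one per component)
    target:        the observed vector (e.g. flattened pose coordinates)
    """
    dim = len(target)

    # Log-softmax over the mixture weights, computed stably.
    top_logit = max(weight_logits)
    log_norm = top_logit + math.log(
        sum(math.exp(w - top_logit) for w in weight_logits)
    )

    log_terms = []
    for logit, mean, log_sigma in zip(weight_logits, means, log_sigmas):
        log_pi = logit - log_norm
        sq_dist = sum((t - m) ** 2 for t, m in zip(target, mean))
        # Log density of an isotropic Gaussian in `dim` dimensions.
        log_pdf = (
            -0.5 * sq_dist / math.exp(2.0 * log_sigma)
            - dim * log_sigma
            - 0.5 * dim * math.log(2.0 * math.pi)
        )
        log_terms.append(log_pi + log_pdf)

    # Log-sum-exp over components gives the mixture log-likelihood.
    top = max(log_terms)
    log_likelihood = top + math.log(sum(math.exp(t - top) for t in log_terms))
    return -log_likelihood


if __name__ == "__main__":
    # Two candidate poses for the same frame; the target sits on the first.
    loss = mdn_nll(
        weight_logits=[0.0, 0.0],
        means=[[0.0, 0.0], [3.0, 3.0]],
        log_sigmas=[0.0, 0.0],
        target=[0.0, 0.0],
    )
    print(f"NLL: {loss:.4f}")
```

Predicting a mixture rather than a single vector lets a model of this kind assign probability to several plausible poses for the same input instead of averaging them into one.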
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Multimodal Machine Learning Applications
