Stable Signer: Hierarchical Sign Language Generative Model
Sen Fang, Yalin Feng, Hongbin Zhong, Yanxin Zhang, Dimitris N. Metaxas

TL;DR
This paper introduces Stable Signer, a hierarchical end-to-end model for sign language video generation that streamlines the multi-stage pipeline and reports a 48.6% performance improvement over state-of-the-art methods.
Contribution
The paper proposes a hierarchical sign language generation framework built around a new Sign Language Understanding Linker (SLUL) and a gesture rendering expert, reducing the number of pipeline stages and improving output quality.
Findings
Performance improved by 48.6% over SOTA methods
Introduces SLUL for better text understanding in sign language generation
Develops SAGM Loss for improved training of understanding linkers
Abstract
Sign Language Production (SLP) is the process of converting complex input text into a real video. Most previous work has focused on the Text2Gloss, Gloss2Pose, and Pose2Vid stages, with some concentrating on the Prompt2Gloss and Text2Avatar stages. However, the field has progressed slowly because inaccuracies in text conversion, pose generation, and the rendering of poses into real human videos accumulate across these stages. In this paper, we therefore streamline the traditionally redundant structure, simplify and optimize the task objective, and design a new sign language generative model called Stable Signer. It redefines SLP as a hierarchical, end-to-end generation task that includes only text understanding (Prompt2Gloss, Text2Gloss) and Pose2Vid, and performs text understanding through our proposed Sign Language Understanding Linker, called SLUL,…
Taxonomy
Topics: Hand Gesture Recognition Systems · Interactive and Immersive Displays · Hearing Impairment and Communication
