FlashSign: Pose-Free Guidance for Efficient Sign Language Video Generation

Liuzhou Zhang; Zeyu Zhang; Biao Wu; Luyao Tang; Zirui Song; Hongyang He; Renda Han; Guangzhen Yao; Huacan Wang; Ronghao Chen; Xiuying Chen; Guan Huang; Zheng Zhu

arXiv:2603.27915·cs.CV·March 31, 2026

TL;DR

This paper introduces FlashSign, a pose-free, diffusion-based framework for real-time sign language video generation that uses a Trainable Sliding Tile Attention (T-STA) mechanism to accelerate inference while preserving quality.

Contribution

The work presents a pose-free, diffusion-based sign language video generator with a trainable attention mechanism, achieving 3.07x faster inference without quality loss.

Findings

1. Increases video generation speed by 3.07x
2. Eliminates reliance on pose estimation for sign language synthesis
3. Maintains high quality in real-time sign language video generation

Abstract

Sign language plays a crucial role in bridging communication gaps for the deaf and hard-of-hearing communities. However, existing sign language video generation models often rely on complex intermediate representations, which limits their flexibility and efficiency. In this work, we propose a novel pose-free framework for real-time sign language video generation. Our method eliminates the need for intermediate pose representations by directly mapping natural language text to sign language videos using a diffusion-based approach. We introduce two key innovations: (1) a pose-free generative model based on a state-of-the-art diffusion backbone, which learns implicit text-to-gesture alignments without pose estimation, and (2) a Trainable Sliding Tile Attention (T-STA) mechanism that accelerates inference by exploiting spatio-temporal locality patterns. Unlike previous training-free…
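The abstract is cut off before it describes how T-STA works, so the following is only a generic illustration of the spatio-temporal locality idea it mentions, not the authors' implementation. It assumes the video latent is split into a 3D grid of tiles (time, height, width) and that each query tile attends only to key tiles inside a fixed window around it. The function name `tile_window_mask` and its `grid` and `window` parameters are hypothetical, and the trainable part of T-STA is not modelled here.

```python
from itertools import product


def tile_window_mask(grid, window):
    """Build a tile-level local attention mask (illustrative sketch only).

    grid   -- number of tiles along (time, height, width)
    window -- window size in tiles along each axis (odd values assumed)

    Returns a list of rows; mask[q][k] is True when key tile k falls inside
    the window associated with query tile q.
    """
    tiles = list(product(*(range(n) for n in grid)))
    half = [w // 2 for w in window]

    def clamp_center(idx, n, h):
        # Shift the window centre inward at the borders so that every
        # query tile sees a full-size window that stays inside the grid.
        return min(max(idx, h), n - 1 - h)

    mask = []
    for q in tiles:
        center = [clamp_center(i, n, h) for i, n, h in zip(q, grid, half)]
        row = [
            all(abs(ki - ci) <= h for ki, ci, h in zip(k, center, half))
            for k in tiles
        ]
        mask.append(row)
    return mask


def mask_density(mask):
    """Fraction of (query tile, key tile) pairs that are kept."""
    kept = sum(sum(row) for row in mask)
    return kept / (len(mask) * len(mask[0]))


# Hypothetical sizes: a 6 x 4 x 4 tile grid with a 3 x 3 x 3 window.
example_mask = tile_window_mask((6, 4, 4), (3, 3, 3))
example_density = mask_density(example_mask)
```

With these hypothetical sizes, each of the 96 query tiles keeps 27 key tiles, so about 28% of the tile pairs are computed. Skipping whole tiles rather than individual tokens is what lets a block-sparse kernel avoid the masked work entirely; how much of the paper's reported speedup comes from this kind of sparsity is not stated in the text above.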

Code & Models

Repositories

AIGeeksGroup/FlashSign (GitHub)
