StgcDiff: Spatial-Temporal Graph Condition Diffusion for Sign Language Transition Generation
Jiashu He, Jiayi He, Shengeng Tang, Huixia Ben, Lechao Cheng, Richang Hong

TL;DR
StgcDiff is a novel graph-based diffusion framework that generates smooth, coherent sign language transitions by modeling complex spatial-temporal dependencies, significantly improving over existing concatenation methods.
Contribution
We introduce a structure-aware, graph-based diffusion model with a Sign-GCN module for realistic sign language transition generation, capturing spatial-temporal cues more effectively.
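The summary names a Sign-GCN module but does not describe its layers. As a rough illustration of what a graph convolution over a skeleton sequence can look like, here is a minimal NumPy sketch. Everything in it is an assumption for illustration, not the authors' architecture: the toy 4-joint chain, the symmetric adjacency normalization, the moving-average temporal step, and the function names.

```python
import numpy as np

def normalized_adjacency(num_joints, edges):
    """Build D^{-1/2} (A + I) D^{-1/2} for an undirected skeleton graph."""
    a = np.eye(num_joints)  # self-loops keep each joint's own features
    for i, j in edges:
        a[i, j] = 1.0
        a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def spatial_graph_conv(x, a_hat, weight):
    """Mix features across neighboring joints within each frame.

    x: (T, J, C_in) skeleton sequence, a_hat: (J, J), weight: (C_in, C_out).
    """
    return np.einsum("ij,tjc,cd->tid", a_hat, x, weight)

def temporal_smooth(h, kernel=3):
    """Average each joint's features over a small window of frames."""
    pad = kernel // 2
    padded = np.pad(h, ((pad, pad), (0, 0), (0, 0)), mode="edge")
    return np.stack([padded[t:t + kernel].mean(axis=0) for t in range(h.shape[0])])

# Hypothetical toy skeleton: 4 joints connected in a chain, 8 frames, 3D coordinates.
edges = [(0, 1), (1, 2), (2, 3)]
a_hat = normalized_adjacency(4, edges)
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4, 3))
w = rng.normal(size=(3, 16))
h = temporal_smooth(spatial_graph_conv(x, a_hat, w))
```

The spatial step shares information along bones within a frame, and the temporal step shares it across adjacent frames. That is the general sense in which a graph model can capture spatial-temporal structure.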
Findings
Outperforms existing methods on PHOENIX14T, USTC-CSL100, and USTC-SLR500 datasets.
Produces more natural and semantically accurate sign language transitions.
Effectively models complex spatial-temporal dependencies in sign language data.
Abstract
Sign language transition generation seeks to convert discrete sign language segments into continuous sign videos by synthesizing smooth transitions. However, most existing methods merely concatenate isolated signs, resulting in poor visual coherence and semantic accuracy in the generated videos. Unlike textual languages, sign language is inherently rich in spatial-temporal cues, making it more complex to model. To address this, we propose StgcDiff, a graph-based conditional diffusion framework that generates smooth transitions between discrete signs by capturing the unique spatial-temporal dependencies of sign language. Specifically, we first train an encoder-decoder architecture to learn a structure-aware representation of spatial-temporal skeleton sequences. Next, we optimize a diffusion denoiser conditioned on the representations learned by the pre-trained encoder, which is tasked with…
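The abstract is cut off before it states the denoiser's objective, so the following is only a sketch of how a conditioned diffusion training step is commonly set up. It assumes the standard DDPM formulation with a linear noise schedule and a noise-prediction loss. The names `make_alpha_bars`, `q_sample`, `noise_prediction_loss`, and `toy_denoiser` are hypothetical, and `cond` stands in for whatever representation the pre-trained encoder provides.

```python
import numpy as np

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bars, noise):
    """Forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise."""
    ab = alpha_bars[t]
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * noise

def noise_prediction_loss(denoiser, x0, cond, t, alpha_bars, rng):
    """Mean squared error between the true noise and the denoiser's prediction."""
    noise = rng.normal(size=x0.shape)
    x_t = q_sample(x0, t, alpha_bars, noise)
    pred = denoiser(x_t, t, cond)  # the condition is passed alongside the noisy input
    return float(np.mean((pred - noise) ** 2))

def toy_denoiser(x_t, t, cond):
    """Placeholder that predicts zero noise; a real model would be learned."""
    return np.zeros_like(x_t)

rng = np.random.default_rng(0)
alpha_bars = make_alpha_bars(1000)
x0 = rng.normal(size=(8, 4, 3))   # hypothetical transition frames (T, J, C)
cond = rng.normal(size=(16,))     # hypothetical encoder representation
loss = noise_prediction_loss(toy_denoiser, x0, cond, 500, alpha_bars, rng)
```

In a setup like this, the condition lets the denoiser tailor the generated frames to the signs on either side of the transition rather than sampling an unconstrained motion.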
Taxonomy
Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Speech and dialogue systems
