Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observation
Shengeng Tang, Jiayi He, Lechao Cheng, Jingjing Wu, Dan Guo, Richang Hong

TL;DR
This paper introduces Sign-D2C, a diffusion-based framework that generates smooth transition frames to convert discrete sign language segments into continuous, natural videos, addressing abrupt transition issues in prior methods.
Contribution
We propose a novel diffusion model approach that learns to generate transition frames for continuous sign language video synthesis, transforming an unsupervised problem into a supervised training task.
Findings
Effective in producing seamless sign language videos
Outperforms existing methods on multiple datasets
Generates natural, smooth transition sequences
Abstract
Generating continuous sign language videos from discrete segments is challenging due to the need for smooth transitions that preserve natural flow and meaning. Traditional approaches that simply concatenate isolated signs often result in abrupt transitions, disrupting video coherence. To address this, we propose a novel framework, Sign-D2C, that employs a conditional diffusion model to synthesize contextually smooth transition frames, enabling the seamless construction of continuous sign language sequences. Our approach transforms the unsupervised problem of transition frame generation into a supervised training task by simulating the absence of transition frames through random masking of segments in long-duration sign videos. The model learns to predict these masked frames by denoising Gaussian noise, conditioned on the surrounding sign observations, allowing it to handle complex,…
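The masking idea in the abstract can be illustrated with a small sketch. It hides a random contiguous span of a long pose sequence, keeps the frames on either side as conditioning context, and applies a standard DDPM-style forward noising step to the hidden span. The function names, the span-length bounds, the pose representation (a list of float vectors), and the noising formula are assumptions made for illustration; the abstract does not specify these details, and this is not the authors' implementation.

```python
import math
import random


def make_training_example(frames, min_gap=2, max_gap=8, rng=random):
    """Mask a random contiguous span to simulate missing transition frames.

    frames: list of pose vectors (each a list of floats).
    Returns (left_context, target, right_context); the target is the span
    a model would be trained to reconstruct. Names and bounds are hypothetical.
    """
    n = len(frames)
    if n < min_gap + 2:
        raise ValueError("sequence too short to keep context on both sides")
    gap = rng.randint(min_gap, min(max_gap, n - 2))
    start = rng.randint(1, n - gap - 1)  # leave at least one frame before
    end = start + gap                    # and at least one frame after
    return frames[:start], frames[start:end], frames[end:]


def add_noise(target, alpha_bar, rng=random):
    """Assumed DDPM-style forward step: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps."""
    scale_signal = math.sqrt(alpha_bar)
    scale_noise = math.sqrt(1.0 - alpha_bar)
    noise = [[rng.gauss(0.0, 1.0) for _ in frame] for frame in target]
    noisy = [
        [scale_signal * x + scale_noise * e for x, e in zip(frame, eps)]
        for frame, eps in zip(target, noise)
    ]
    return noisy, noise


# Toy "video": 20 frames of 2-D pose values.
rng = random.Random(0)
video = [[float(t), float(t) * 0.5] for t in range(20)]
left, target, right = make_training_example(video, rng=rng)
noisy, noise = add_noise(target, alpha_bar=0.5, rng=rng)
```

In a setup like this, a denoising network would receive `noisy` together with `left` and `right` and be trained to recover `noise` (or `target`), which is what turns the missing-transition problem into a supervised one.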
Taxonomy
Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Human Pose and Action Recognition
Methods: Diffusion
