Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observation

Shengeng Tang; Jiayi He; Lechao Cheng; Jingjing Wu; Dan Guo; Richang Hong

arXiv:2411.16810 · cs.CV · November 27, 2024

Open Access

TL;DR

This paper introduces Sign-D2C, a diffusion-based framework that generates smooth transition frames to convert discrete sign language segments into continuous, natural videos, addressing abrupt transition issues in prior methods.

Contribution

We propose a novel diffusion model approach that learns to generate transition frames for continuous sign language video synthesis, transforming an unsupervised problem into a supervised training task.

Findings

01. Effective in producing seamless sign language videos

02. Outperforms existing methods on multiple datasets

03. Generates natural, smooth transition sequences

Abstract

Generating continuous sign language videos from discrete segments is challenging due to the need for smooth transitions that preserve natural flow and meaning. Traditional approaches that simply concatenate isolated signs often result in abrupt transitions, disrupting video coherence. To address this, we propose a novel framework, Sign-D2C, that employs a conditional diffusion model to synthesize contextually smooth transition frames, enabling the seamless construction of continuous sign language sequences. Our approach transforms the unsupervised problem of transition frame generation into a supervised training task by simulating the absence of transition frames through random masking of segments in long-duration sign videos. The model learns to predict these masked frames by denoising Gaussian noise, conditioned on the surrounding sign observations, allowing it to handle complex,…
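
The masking idea in the abstract can be illustrated with a small sketch. It hides a random contiguous span of frames in a long pose sequence, so the hidden frames become the supervised target and the surrounding frames become the conditioning context; it then applies the standard DDPM forward-noising step to that target. This is only an illustration under stated assumptions, not the authors' implementation: the function names `make_training_pair` and `add_gaussian_noise`, the span-length bounds, the zero-filling of masked frames, and the single `alpha_bar` value are all hypothetical, since the excerpt does not specify the masking strategy or noise schedule.

```python
import math
import random


def make_training_pair(poses, min_len=2, max_len=8, rng=None):
    """Hide a random contiguous span of frames (hypothetical helper).

    poses: list of frames, each frame a list of floats (flattened keypoints).
    Returns (context, target, start, end), where context is a copy of the
    sequence with frames [start, end) zeroed out and target holds the hidden
    frames. At least one frame is kept on each side of the span.
    """
    rng = rng or random.Random()
    n = len(poses)
    if n < min_len + 2:
        raise ValueError("sequence too short to mask with context on both sides")
    span = rng.randint(min_len, min(max_len, n - 2))
    start = rng.randint(1, n - span - 1)
    end = start + span
    target = [list(frame) for frame in poses[start:end]]
    context = [list(frame) for frame in poses]
    for t in range(start, end):
        context[t] = [0.0] * len(poses[t])  # assumption: masked frames are zero-filled
    return context, target, start, end


def add_gaussian_noise(target, alpha_bar, rng=None):
    """Standard DDPM forward step: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps.

    alpha_bar is the cumulative signal fraction at the chosen timestep
    (assumed given here; the schedule is not stated in the excerpt).
    Returns (noisy, noise); a denoiser would be trained to recover the
    noise or the clean target from noisy frames plus the context.
    """
    rng = rng or random.Random()
    signal = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    noise = [[rng.gauss(0.0, 1.0) for _ in frame] for frame in target]
    noisy = [
        [signal * x + sigma * e for x, e in zip(frame, eps)]
        for frame, eps in zip(target, noise)
    ]
    return noisy, noise


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy sequence: 20 frames with 2 coordinates each.
    poses = [[float(t), 2.0 * t] for t in range(20)]
    context, target, start, end = make_training_pair(poses, rng=rng)
    noisy, noise = add_gaussian_noise(target, alpha_bar=0.5, rng=rng)
    print(f"masked frames [{start}, {end}), {len(noisy)} noisy target frames")
```

Keeping at least one observed frame on each side of the span mirrors the inference setting, where the transition is generated between two known sign segments.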

Taxonomy

Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Human Pose and Action Recognition

Methods: Diffusion