DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models

Sicheng Yang; Zhiyong Wu; Minglei Li; Zhensong Zhang; Lei Hao; Weihong Bao; Ming Cheng; Long Xiao

arXiv:2305.04919·cs.HC·May 9, 2023·5 cites

Open Access · 1 Repo · 1 Model

TL;DR

DiffuseStyleGesture employs diffusion models with attention mechanisms to generate high-quality, stylized, and diverse co-speech gestures that match speech rhythm and semantics, advancing automatic gesture synthesis.

Contribution

It introduces a diffusion-based approach with attention mechanisms and style control for speech-driven gesture generation, improving realism and diversity.

Findings

01. Outperforms recent methods in gesture quality and diversity
02. Generates speech-matched and stylized gestures effectively
03. Enables style control through interpolation and extrapolation

Abstract

Beyond speech, gestures are part of the art of communication. Automatic co-speech gesture generation has drawn much attention in computer animation. It is a challenging task because gestures are diverse and because the rhythm and semantics of a gesture are difficult to match to the corresponding speech. To address these problems, we present DiffuseStyleGesture, a diffusion-model-based approach to speech-driven gesture generation. It generates high-quality, speech-matched, stylized, and diverse co-speech gestures for speech of arbitrary length. Specifically, we introduce cross-local attention and self-attention into the gesture diffusion pipeline to generate more realistic gestures that better match the speech. We then train our model with classifier-free guidance to control the gesture style by interpolation or extrapolation. Additionally, we improve the diversity of generated gestures…
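The style control by interpolation or extrapolation mentioned in the abstract can be illustrated with the standard classifier-free guidance blend. The sketch below is not taken from the paper's code: the function name `guided_prediction`, the flat list representation, and the `scale` parameter are hypothetical and chosen only for illustration. It assumes the denoiser has already produced one prediction with the style condition and one without it.

```python
from typing import List


def guided_prediction(pred_cond: List[float],
                      pred_uncond: List[float],
                      scale: float) -> List[float]:
    """Blend conditional and unconditional predictions (illustrative only).

    Standard classifier-free guidance rule, applied elementwise:
        out = uncond + scale * (cond - uncond)
    scale = 0 ignores the condition, 0 < scale < 1 interpolates,
    scale = 1 returns the conditional prediction, scale > 1 extrapolates.
    """
    if len(pred_cond) != len(pred_uncond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for c, u in zip(pred_cond, pred_uncond)]


# Hypothetical toy values standing in for per-joint model outputs.
cond = [1.0, 2.0]
uncond = [0.0, 0.0]
half_style = guided_prediction(cond, uncond, 0.5)    # interpolation
strong_style = guided_prediction(cond, uncond, 2.0)  # extrapolation
```

With a scale between 0 and 1 the result lies between the unstyled and styled predictions; a scale above 1 pushes past the styled prediction, which is the sense in which the style is exaggerated.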

Code & Models

Repositories

youngseng/diffusestylegesture (PyTorch, official)

Models

youngseng/DiffuseStyleGesture (Hugging Face model, ♡ 3)

Taxonomy

Topics: Human Motion and Animation · Hand Gesture Recognition Systems · Human Pose and Action Recognition