EMoG: Synthesizing Emotive Co-speech 3D Gesture with Diffusion Model

Lianying Yin; Yijun Wang; Tianyu He; Jinming Liu; Wei Zhao; Bohan Li; Xin Jin; Jianxin Lin

arXiv:2306.11496·cs.CV·June 21, 2023·2 cites

TL;DR

This paper introduces EMoG, a diffusion model-based framework that synthesizes emotive co-speech 3D gestures by addressing diversity and joint correlation challenges, outperforming previous methods.

Contribution

The paper proposes a novel diffusion model framework with emotion guidance and a joint correlation-aware transformer for improved gesture synthesis.

Findings

01 Outperforms previous state-of-the-art methods

02 Effectively models emotion-guided gesture generation

03 Demonstrates superior diversity and realism in synthesized gestures

Abstract

Although previous co-speech gesture generation methods can synthesize motions in line with speech content, they still struggle to handle diverse and complicated motion distributions. The key challenges are: 1) the one-to-many mapping between speech content and gestures; 2) modeling the correlation between body joints. In this paper, we present a novel framework (EMoG) that tackles these challenges with denoising diffusion models: 1) To alleviate the one-to-many problem, we incorporate emotion cues to guide the generation process, making generation much easier; 2) To model joint correlation, we decompose the difficult gesture generation task into two sub-problems: joint correlation modeling and temporal dynamics modeling. The two sub-problems are then explicitly tackled with our proposed Joint Correlation-aware transFormer (JCFormer). Through extensive…
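The decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' JCFormer: it assumes a gesture clip stored as `x[t][j]` (a feature vector for joint `j` at frame `t`), uses plain scaled dot-product self-attention with no learned projections, and omits the diffusion process and emotion conditioning entirely. The function names `self_attention` and `joint_then_temporal` are hypothetical and chosen only for illustration.

```python
import math
import random


def self_attention(tokens):
    """Unparameterized scaled dot-product self-attention over a list of vectors.

    Assumption: no learned query/key/value projections; each token attends
    to all tokens (including itself) using raw dot products.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


def joint_then_temporal(x):
    """Apply attention over joints first, then over time (hypothetical ordering)."""
    T, J = len(x), len(x[0])
    # Sub-problem 1, joint correlation: mix information across joints within each frame.
    x = [self_attention(frame) for frame in x]
    # Sub-problem 2, temporal dynamics: mix information across frames for each joint.
    per_joint = [self_attention([x[t][j] for t in range(T)]) for j in range(J)]
    # Restore the [frame][joint] layout.
    return [[per_joint[j][t] for j in range(J)] for t in range(T)]


random.seed(0)
T, J, D = 4, 3, 2  # frames, joints, feature size (arbitrary toy values)
gesture = [[[random.gauss(0, 1) for _ in range(D)] for _ in range(J)] for _ in range(T)]
out = joint_then_temporal(gesture)
print(len(out), len(out[0]), len(out[0][0]))  # shape is preserved: 4 3 2
```

Splitting attention along the two axes keeps each attention call small (over `J` joints or `T` frames rather than `T * J` tokens at once), which is one plausible motivation for treating the two sub-problems separately; the abstract itself does not state the ordering or the internal design.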

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Hand Gesture Recognition Systems

Methods: Diffusion