The DiffuseStyleGesture+ entry to the GENEA Challenge 2023

Sicheng Yang; Haiwei Xue; Zhensong Zhang; Minglei Li; Zhiyong Wu; Xiaofei Wu; Songcen Xu; Zonghong Dai

arXiv:2308.13879·cs.HC·August 29, 2023

TL;DR

This paper presents DiffuseStyleGesture+, a diffusion-model-based system for generating realistic conversational gestures for embodied agents. Evaluated in the GENEA Challenge 2023, it performed on par with the top-tier entries.

Contribution

The paper introduces DiffuseStyleGesture+, a diffusion-model approach that integrates multiple modalities for gesture generation and performs on par with the top-tier entries in the GENEA Challenge 2023.

Findings

01 Performs on par with top models in human-likeness and appropriateness

02 Uses multimodal inputs including audio, text, speaker ID, and seed gestures

03 Achieves competitive results in gesture generation for conversational agents

Abstract

In this paper, we introduce the DiffuseStyleGesture+, our solution for the Generation and Evaluation of Non-verbal Behavior for Embodied Agents (GENEA) Challenge 2023, which aims to foster the development of realistic, automated systems for generating conversational gestures. Participants are provided with a pre-processed dataset and their systems are evaluated through crowdsourced scoring. Our proposed model, DiffuseStyleGesture+, leverages a diffusion model to generate gestures automatically. It incorporates a variety of modalities, including audio, text, speaker ID, and seed gestures. These diverse modalities are mapped to a hidden space and processed by a modified diffusion model to produce the corresponding gesture for a given speech input. Upon evaluation, the DiffuseStyleGesture+ demonstrated performance on par with the top-tier models in the challenge, showing no significant…
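The conditioning idea in the abstract (several modalities mapped to one hidden space, then fed to a diffusion model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the feature sizes, the random linear projections, and the choice to fuse modalities by summation are hypothetical and are not taken from the paper. The noising step is the standard DDPM forward process, not necessarily the authors' modified variant.

```python
import math
import random

HIDDEN = 5        # hypothetical hidden size
ALPHA_BAR = 0.9   # hypothetical cumulative noise-schedule value at some step t


def init_weight(d_out, d_in, rng):
    """Random projection matrix with d_out rows of length d_in."""
    scale = 1.0 / math.sqrt(d_in)
    return [[rng.uniform(-scale, scale) for _ in range(d_in)] for _ in range(d_out)]


def linear(x, weight):
    """Project vector x with a (d_out x d_in) weight matrix."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def fuse_conditions(features, weights):
    """Map each modality to the hidden size and sum the results.

    Summation is an assumed fusion rule, chosen only to keep the sketch short.
    """
    hidden = [0.0] * HIDDEN
    for name, x in features.items():
        projected = linear(x, weights[name])
        hidden = [a + b for a, b in zip(hidden, projected)]
    return hidden


def q_sample(x0, alpha_bar_t, rng):
    """Standard DDPM forward noising: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [
        math.sqrt(alpha_bar_t) * a + math.sqrt(1.0 - alpha_bar_t) * e
        for a, e in zip(x0, eps)
    ]
    return x_t, eps


rng = random.Random(0)

# Hypothetical per-frame features for the four modalities named in the abstract.
features = {
    "audio": [rng.random() for _ in range(8)],
    "text": [rng.random() for _ in range(6)],
    "speaker_id": [0.0, 1.0, 0.0],              # one-hot speaker identity
    "seed_gesture": [rng.random() for _ in range(4)],
}
weights = {name: init_weight(HIDDEN, len(x), rng) for name, x in features.items()}

cond = fuse_conditions(features, weights)       # shared hidden-space condition
gesture = [rng.random() for _ in range(4)]      # hypothetical clean gesture frame
x_t, eps = q_sample(gesture, ALPHA_BAR, rng)    # noised gesture at step t
```

In a trained system, a denoising network would take `x_t`, the step index, and `cond` as inputs and learn to recover the clean gesture or the noise `eps`; that network is omitted here because the page does not describe it.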

Code & Models

Repositories

youngseng/diffusestylegesture (PyTorch, official)

Taxonomy

Methods: Diffusion