The GENEA Challenge 2023: A large scale evaluation of gesture generation models in monadic and dyadic settings

Taras Kucherenko; Rajmund Nagy; Youngwoo Yoon; Jieyeon Woo; Teodor Nikolov; Mihail Tsakov; Gustav Eje Henter

arXiv:2308.12646·cs.HC·August 25, 2023

TL;DR

The GENEA Challenge 2023 evaluated speech-driven gesture generation models in monadic and dyadic settings, revealing significant variability in human-likeness and highlighting the challenges in achieving natural and contextually appropriate gestures.

Contribution

This paper presents a large-scale evaluation of gesture generation models in dyadic interactions, providing insights into their performance and the gap to natural motion.

Findings

01. Wide variation in human-likeness among submissions
02. Most systems perform slightly above chance in appropriateness
03. Dyadic appropriateness is challenging and not strongly correlated with speech quality

Abstract

This paper reports on the GENEA Challenge 2023, in which participating teams built speech-driven gesture-generation systems using the same speech and motion dataset, followed by a joint evaluation. This year's challenge provided data on both sides of a dyadic interaction, allowing teams to generate full-body motion for an agent given its speech (text and audio) and the speech and motion of the interlocutor. We evaluated 12 submissions and 2 baselines together with held-out motion-capture data in several large-scale user studies. The studies focused on three aspects: 1) the human-likeness of the motion, 2) the appropriateness of the motion for the agent's own speech whilst controlling for the human-likeness of the motion, and 3) the appropriateness of the motion for the behaviour of the interlocutor in the interaction, using a setup that controls for both the human-likeness of the motion…

Taxonomy

Topics: Speech and dialogue systems · Hand Gesture Recognition Systems · Social Robot Interaction and HRI