Modeling Turn-Taking with Semantically Informed Gestures

Varsha Suresh; M. Hamza Mughal; Christian Theobalt; Vera Demberg

arXiv:2510.19350 · cs.CL · March 23, 2026

TL;DR

This paper introduces a new multimodal dataset with semantic gesture annotations and demonstrates that including gestures improves turn-taking prediction in conversation models.

Contribution

The study presents DnD Gesture++, an extension of the multi-party DnD Gesture corpus with 2,663 semantic gesture annotations, and a Mixture-of-Experts model that integrates gestures with text and audio for turn-taking prediction.

Findings

01. Gestures provide complementary cues for turn-taking.
02. Incorporating gestures improves prediction accuracy.
03. Semantic gesture annotations enhance model performance.

Abstract

In conversation, humans use multimodal cues, such as speech, gestures, and gaze, to manage turn-taking. While linguistic and acoustic features are informative, gestures provide complementary cues for modeling these transitions. To study this, we introduce DnD Gesture++, an extension of the multi-party DnD Gesture corpus enriched with 2,663 semantic gesture annotations spanning iconic, metaphoric, deictic, and discourse types. Using this dataset, we model turn-taking prediction through a Mixture-of-Experts framework integrating text, audio, and gestures. Experiments show that incorporating semantically guided gestures yields consistent performance gains over baselines, demonstrating their complementary role in multimodal turn-taking.
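To make the idea of a Mixture-of-Experts over modalities concrete, here is a minimal sketch of gated fusion across text, audio, and gesture features. It is an illustration only, not the authors' architecture, which this page does not describe. The class name `ModalityMoE`, the feature dimensions, the linear experts, the softmax gate, and the two-class output (read here as hold vs. shift) are all assumptions, and the weights are random rather than trained.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ModalityMoE:
    """Toy gated fusion (hypothetical): one linear expert per modality
    plus a softmax gate that weights the experts for each example."""

    def __init__(self, dims, n_classes=2, seed=0):
        rng = np.random.default_rng(seed)
        self.names = list(dims)
        # One linear expert per modality: features -> class logits.
        self.experts = {
            m: rng.normal(0.0, 0.1, (d, n_classes)) for m, d in dims.items()
        }
        # The gate sees all modalities concatenated and scores each expert.
        self.gate = rng.normal(0.0, 0.1, (sum(dims.values()), len(dims)))

    def forward(self, feats):
        x = np.concatenate([feats[m] for m in self.names], axis=-1)
        weights = softmax(x @ self.gate)              # (batch, n_experts)
        logits = np.stack(
            [feats[m] @ self.experts[m] for m in self.names], axis=1
        )                                             # (batch, n_experts, n_classes)
        mixed = (weights[..., None] * logits).sum(axis=1)
        return softmax(mixed), weights


# Assumed feature sizes; real text/audio/gesture encoders would supply these.
dims = {"text": 8, "audio": 6, "gesture": 4}
model = ModalityMoE(dims)
rng = np.random.default_rng(1)
batch = {m: rng.normal(size=(3, d)) for m, d in dims.items()}
probs, weights = model.forward(batch)
```

Here `probs` holds per-example class probabilities and `weights` shows how much the gate relies on each modality, which is one simple way to inspect whether gesture features contribute alongside text and audio.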

Taxonomy

Topics: Hearing Impairment and Communication · Action Observation and Synchronization · Speech and dialogue systems