Conveying Meaning through Gestures: An Investigation into Semantic Co-Speech Gesture Generation

Hendric Voss; Lisa Michelle Bohnenkamp; Stefan Kopp

arXiv:2510.17599·cs.HC·October 21, 2025

Open Access

TL;DR

This paper compares two co-speech gesture generation frameworks, AQ-GT and its semantically augmented variant AQ-GT-a. AQ-GT, which lacks explicit semantic input, conveys concepts more effectively within its training domain, while AQ-GT-a generalizes better to novel contexts, particularly for shape and size.

Contribution

It evaluates two frameworks for semantic co-speech gesture generation in a user-centered study of concept recognition and human-likeness, showing that semantic annotations have a nuanced effect on performance and perception.

Findings

01

AQ-GT effectively conveys concepts within its training domain.

02

AQ-GT-a generalizes better to novel contexts.

03

Participants found AQ-GT-a gestures more expressive, but not more human-like.

Abstract

This study explores two frameworks for co-speech gesture generation, AQ-GT and its semantically-augmented variant AQ-GT-a, to evaluate their ability to convey meaning through gestures and how humans perceive the resulting movements. Using sentences from the SAGA spatial communication corpus, contextually similar sentences, and novel movement-focused sentences, we conducted a user-centered evaluation of concept recognition and human-likeness. Results revealed a nuanced relationship between semantic annotations and performance. The original AQ-GT framework, lacking explicit semantic input, was surprisingly more effective at conveying concepts within its training domain. Conversely, the AQ-GT-a framework demonstrated better generalization, particularly for representing shape and size in novel contexts. While participants rated gestures from AQ-GT-a as more expressive and helpful, they did…

Taxonomy

Topics: Hearing Impairment and Communication · Hand Gesture Recognition Systems · Action Observation and Synchronization