Incorporating Spatial Awareness in Data-Driven Gesture Generation for Virtual Agents
Anna Deichler, Simon Alexanderson, Jonas Beskow

TL;DR
This paper integrates spatial context into data-driven gesture generation for virtual agents and introduces a new synthetic dataset for that purpose, aiming at more natural, environment-aware non-verbal communication.
Contribution
The paper introduces an approach for incorporating scene information into data-driven, speech-driven gesture generation models for virtual agents.
Findings
Development of a synthetic gesture dataset with spatial context
A step toward more natural gestures in virtual agents
A step toward richer interaction with users and surroundings through environment-aware gestures
Abstract
This paper focuses on enhancing human-agent communication by integrating spatial context into virtual agents' non-verbal behaviors, specifically gestures. Recent advances in co-speech gesture generation have primarily utilized data-driven methods, which create natural motion but limit the scope of gestures to those performed in a void. Our work aims to extend these methods by enabling generative models to incorporate scene information into speech-driven gesture synthesis. We introduce a novel synthetic gesture dataset tailored for this purpose. This development represents a critical step toward creating embodied conversational agents that interact more naturally with their environment and users.
