Synthetically Expressive: Evaluating gesture and voice for emotion and empathy in VR and 2D scenarios
Haoyang Du, Kiran Chhatre, Christopher Peters, Brian Keegan, Rachel McDonnell, Cathy Ennis

TL;DR
This study evaluates how real and synthetic speech and gestures influence user perception of emotion and empathy in VR and 2D scenarios. It finds that VR enhances the perceived match of natural gesture-voice pairings but not of synthetic ones.
Contribution
The paper provides empirical insights into how immersion and synthetic signals affect emotional perception and co-presence in virtual environments.
Findings
VR enhances perception of natural gesture-voice matching
Synthetic gestures are perceived as less natural in VR
Immersion amplifies perceptual gaps between real and synthetic signals
Abstract
The creation of virtual humans increasingly leverages automated synthesis of speech and gestures, enabling expressive, adaptable agents that effectively engage users. However, the independent development of voice and gesture generation technologies, alongside the growing popularity of virtual reality (VR), presents significant questions about the integration of these signals and their ability to convey emotional detail in immersive environments. In this paper, we evaluate the influence of real and synthetic gestures and speech, alongside varying levels of immersion (VR vs. 2D displays) and emotional contexts (positive, neutral, negative) on user perceptions. We investigate how immersion affects the perceived match between gestures and speech and the impact on key aspects of user experience, including emotional and empathetic responses and the sense of co-presence. Our findings indicate…
