Evaluation of Generative Models for Emotional 3D Animation Generation in VR
Kiran Chhatre, Renan Guarese, Andrii Matviienko, Christopher Peters

TL;DR
This study evaluates speech-driven 3D emotional animation models in VR using user-centric metrics, revealing strengths in emotion recognition but limitations in realism, naturalness, and enjoyment compared to real human expressions.
Contribution
It provides a comprehensive user study assessing emotional 3D animation models in VR, highlighting the importance of emotion modeling and user-centric evaluation metrics.
Findings
Emotion-specific models improve recognition accuracy.
Happy animations were rated more realistic and natural than neutral ones.
Generative models lag behind reconstruction-based methods in facial expression quality.
Abstract
Social interactions incorporate nonverbal signals to convey emotions alongside speech, including facial expressions and body gestures. Generative models have demonstrated promising results in creating full-body nonverbal animations synchronized with speech; however, evaluations using statistical metrics in 2D settings fail to fully capture user-perceived emotions, limiting our understanding of model effectiveness. To address this, we evaluate emotional 3D animation generative models within a Virtual Reality (VR) environment, emphasizing user-centric metrics (emotional arousal realism, naturalness, enjoyment, diversity, and interaction quality) in a real-time human-agent interaction scenario. Through a user study (N=48), we examine perceived emotional quality for three state-of-the-art speech-driven 3D animation methods across two emotions: happiness (high arousal) and neutral (mid…
