Unimodal and Multimodal Static Facial Expression Recognition for Virtual Reality Users with EmoHeVRDB
Thorben Ortmann, Qi Wang, Larissa Putzar

TL;DR
This paper investigates facial expression recognition (FER) in VR using Facial Expression Activations (FEAs) captured by the Meta Quest Pro headset. It reaches up to 73.02% accuracy with unimodal approaches and 80.42% with multimodal fusion of FEA and image data, establishing new benchmarks for VR-based FER.
Contribution
The paper is the first to use EmoHeVRDB's FEA data for unimodal and multimodal static FER, raising accuracy well above the 69.84% baseline reported for EmoHeVRDB's image data.
Findings
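The best unimodal approach reached 73.02% accuracy on the static FER task with seven emotion categories.
Intermediate fusion of FEA and image data raised accuracy to 80.42%.
The fused model surpasses the 69.84% image-only baseline reported for EmoHeVRDB by 10.58 percentage points.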
Abstract
In this study, we explored the potential of utilizing Facial Expression Activations (FEAs) captured via the Meta Quest Pro Virtual Reality (VR) headset for Facial Expression Recognition (FER) in VR settings. Leveraging the EmojiHeroVR Database (EmoHeVRDB), we compared several unimodal approaches and achieved up to 73.02% accuracy for the static FER task with seven emotion categories. Furthermore, we integrated FEA and image data in multimodal approaches, observing significant improvements in recognition accuracy. An intermediate fusion approach achieved the highest accuracy of 80.42%, significantly surpassing the baseline evaluation result of 69.84% reported for EmoHeVRDB's image data. Our study is the first to utilize EmoHeVRDB's unique FEA data for unimodal and multimodal static FER, establishing new benchmarks for FER in VR settings. Our findings highlight the potential of fusing…
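The abstract names intermediate fusion as the best-performing approach but this page gives no architectural detail. As a rough illustration of the general idea only, the sketch below encodes each modality separately, concatenates the hidden features, and classifies the fused vector into seven emotion categories. It is not the authors' model: the FEA vector length (`FEA_DIM`), the image embedding size (`IMG_FEAT_DIM`), the hidden width, and the random, untrained weights are all assumptions made for illustration.

```python
import numpy as np

NUM_CLASSES = 7      # seven emotion categories, as stated in the abstract
FEA_DIM = 63         # ASSUMPTION: number of FEA values per sample
IMG_FEAT_DIM = 128   # ASSUMPTION: size of a precomputed image embedding

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class IntermediateFusion:
    """Hypothetical two-branch model; weights are random, not trained."""

    def __init__(self, fea_dim=FEA_DIM, img_dim=IMG_FEAT_DIM,
                 hidden=32, num_classes=NUM_CLASSES):
        self.w_fea = rng.normal(0.0, 0.1, (fea_dim, hidden))
        self.w_img = rng.normal(0.0, 0.1, (img_dim, hidden))
        self.w_out = rng.normal(0.0, 0.1, (2 * hidden, num_classes))

    def forward(self, fea, img_feat):
        # Each modality gets its own encoder branch.
        h_fea = relu(fea @ self.w_fea)
        h_img = relu(img_feat @ self.w_img)
        # Intermediate fusion: join hidden features, not raw inputs
        # (early fusion) or per-model predictions (late fusion).
        fused = np.concatenate([h_fea, h_img], axis=-1)
        return softmax(fused @ self.w_out)


model = IntermediateFusion()
fea_batch = rng.random((4, FEA_DIM))             # placeholder FEA values
img_batch = rng.normal(size=(4, IMG_FEAT_DIM))   # placeholder image features
probs = model.forward(fea_batch, img_batch)
print(probs.shape)
```

Fusing at the feature level lets the classifier learn interactions between the two modalities, which is not possible when only the final predictions of separate models are combined.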
Taxonomy
Topics: Emotion and Mood Recognition
