# Toward Generalized Emotion Recognition in VR by Bridging Natural and Acted Facial Expressions

**Authors:** Rahat Rizvi Rahman, Hee Yun Choi, Joonghyo Lim, Go Eun Lee, Seungmoo Lee, Chungyean Cho, Kostadin Damevski

PMC · DOI: 10.3390/s26030845 · Sensors (Basel, Switzerland) · 2026-01-28

## TL;DR

This paper improves emotion recognition in VR by training models jointly on acted and natural facial expressions, which yields stronger cross-domain generalization and more robust systems.

## Contribution

The study develops and evaluates emotion recognition models that learn jointly from acted and natural facial expressions in VR, using two complementary datasets collected with the Meta Quest Pro headset.

## Key findings

- Models trained on both acted and natural data show better cross-domain generalization.
- Domain-adversarial and mixture-of-experts models achieved the highest accuracy on natural and mixed-emotion evaluations.
- Generalizable models learn shared facial action unit patterns from both expression types.
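
To make the mixture-of-experts idea in the findings above more concrete, here is a minimal sketch in plain Python. It is not the authors' architecture, which this summary does not describe. It assumes a soft gate that weights two hypothetical experts (one standing in for natural expressions, one for acted) over a small vector of facial action unit intensities. All names, weights, and dimensions are illustrative.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def moe_predict(x, gate, experts):
    """Mix per-expert class probabilities using gate weights.

    x       : feature vector (here, hypothetical AU intensities)
    gate    : dict with "W" and "b"; yields one score per expert
    experts : list of dicts with "W" and "b"; each yields class scores
    """
    gate_w = softmax(linear(gate["W"], gate["b"], x))
    expert_probs = [softmax(linear(e["W"], e["b"], x)) for e in experts]
    n_classes = len(expert_probs[0])
    mixed = [sum(g * p[c] for g, p in zip(gate_w, expert_probs))
             for c in range(n_classes)]
    return mixed, gate_w

# Toy parameters (assumed, not from the paper): 3 AU features, 2 emotion classes.
gate = {"W": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], "b": [0.0, 0.0]}
natural_expert = {"W": [[0.5, -0.2, 0.1], [-0.5, 0.2, 0.3]], "b": [0.0, 0.0]}
acted_expert = {"W": [[1.5, 0.0, -0.4], [-1.0, 0.3, 0.2]], "b": [0.0, 0.0]}

x = [0.8, 0.1, 0.3]
probs, gate_weights = moe_predict(x, gate, [natural_expert, acted_expert])
print("class probabilities:", probs)
print("gate weights (natural, acted):", gate_weights)
```

Because the gate weights and each expert's output both sum to one, the mixed output is itself a valid probability distribution. In a trained model the gate and expert parameters would be learned rather than set by hand.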

## Abstract

Recognizing emotions accurately in virtual reality (VR) enables adaptive and personalized experiences across gaming, therapy, and other domains. However, most existing facial emotion recognition models rely on acted expressions collected under controlled settings, which differ substantially from the spontaneous and subtle emotions that arise during real VR experiences. To address this challenge, the objective of this study is to develop and evaluate generalizable emotion recognition models that jointly learn from both acted and natural facial expressions in virtual reality. We integrate two complementary datasets collected using the Meta Quest Pro headset, one capturing natural emotional reactions and another containing acted expressions. We evaluate multiple model architectures, including convolutional and domain-adversarial networks, and a mixture-of-experts model that separates natural and acted expressions. Our experiments show that models trained jointly on acted and natural data achieve stronger cross-domain generalization. In particular, the domain-adversarial and mixture-of-experts configurations yield the highest accuracy on natural and mixed-emotion evaluations. Analysis of facial action units (AUs) reveals that natural and acted emotions rely on partially distinct AU patterns, while generalizable models learn a shared representation that integrates salient AUs from both domains. These findings demonstrate that bridging acted and natural expression domains can enable more accurate and robust VR emotion recognition systems.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12899423/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12899423/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/PMC12899423/full.md

---
Source: https://tomesphere.com/paper/PMC12899423