Learning Disentangled Expression Representations from Facial Images
Marah Halawa, Manuel Wöllhaf, Eduardo Vellasques, Urko Sánchez Sanz, and Olaf Hellwich

TL;DR
This paper introduces an adversarial learning approach to obtain disentangled facial expression representations, enhancing expression recognition accuracy on AffectNet without extra data.
Contribution
It proposes a novel adversarial loss formulation for disentangling facial factors, improving state-of-the-art expression recognition performance.
Findings
Achieved 60.53% accuracy on the AffectNet dataset
Enabled learning from single-task datasets
Improved expression recognition without additional data
Abstract
Face images are subject to many different factors of variation, especially in unconstrained in-the-wild scenarios. For most tasks involving such images, e.g. expression recognition from video streams, obtaining enough labeled data is prohibitively expensive. One common strategy to tackle this problem is to learn disentangled representations for the different factors of variation of the observed data using adversarial learning. In this paper, we use a formulation of the adversarial loss to learn disentangled representations for face images. The model used facilitates learning on single-task datasets and improves the state of the art in expression recognition, reaching an accuracy of 60.53% on the AffectNet dataset without using any additional data.
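The abstract does not spell out the loss itself, so the following is only a minimal sketch of one generic adversarial disentanglement objective, not the authors' formulation. It assumes an encoder whose expression head is trained with cross-entropy while a separate adversary tries to predict a nuisance factor (e.g. identity) from the same representation. The function names and the weight `lam` are hypothetical, chosen for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Negative log-likelihood of the true class.
    return -math.log(softmax(logits)[label])

def adversary_loss(nuisance_logits, nuisance_label):
    # The adversary is trained to predict the nuisance factor
    # from the expression representation.
    return cross_entropy(nuisance_logits, nuisance_label)

def encoder_loss(expr_logits, expr_label,
                 nuisance_logits, nuisance_label, lam=0.1):
    # Assumed generic objective (not taken from the paper):
    # classify the expression well while making the adversary fail,
    # i.e. minimize the task loss minus a weighted adversary loss.
    task = cross_entropy(expr_logits, expr_label)
    adv = cross_entropy(nuisance_logits, nuisance_label)
    return task - lam * adv
```

In a training loop under these assumptions, the adversary would be updated to minimize `adversary_loss` and the encoder to minimize `encoder_loss`, alternating between the two. The encoder's objective drops as the adversary's predictions get worse, which is what pushes nuisance information out of the expression representation.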
Taxonomy
Topics: Face recognition and analysis · Human Pose and Action Recognition · Emotion and Mood Recognition
