Recognizing Facial Expressions in the Wild using Multi-Architectural Representations based Ensemble Learning with Distillation
Rauf Momin, Ali Shan Momin, Khalid Rasheed, Muhammad Saqib

TL;DR
This paper introduces EmoXNet, an ensemble learning model, and EmoXNetLite, a distilled, efficient neural network for facial expression recognition, achieving high accuracy and good generalization on benchmark datasets.
Contribution
The paper presents a novel ensemble model and a distillation approach for real-time facial expression recognition, improving accuracy and efficiency.
Findings
EmoXNet achieved 85.07% test accuracy on FER2013 with FER+ annotations.
EmoXNetLite achieved 82.07% accuracy on FER2013.
Models generalize well on new data.
Abstract
Facial expressions are the most common universal forms of body language. In the past few years, automatic facial expression recognition (FER) has been an active field of research. However, it remains a challenging task due to various uncertainties and complications, and both efficiency and performance are essential for building robust systems. We propose two models: EmoXNet, an ensemble learning technique for learning convoluted facial representations, and EmoXNetLite, a distillation technique that transfers knowledge from our ensemble model to an efficient deep neural network using label-smoothed soft labels, so that expressions can be detected effectively in real time. Both techniques performed well: the ensemble model (EmoXNet) achieved 85.07% test accuracy on FER2013 with FER+ annotations and 86.25% test…
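The distillation idea in the abstract (an ensemble teacher producing label-smoothed soft labels for a smaller student) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: averaging member probabilities as the ensemble rule, the smoothing factor `epsilon=0.1`, the temperature, the three-class toy logits, and cross-entropy as the student loss. The function names are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_soft_labels(member_logits, temperature=1.0):
    """Average the class probabilities of all ensemble members (assumed rule)."""
    probs = [softmax(l, temperature) for l in member_logits]
    n = len(probs)
    return [sum(p[k] for p in probs) / n for k in range(len(probs[0]))]

def smooth(labels, epsilon=0.1):
    """Label smoothing: mix a distribution with the uniform distribution."""
    k = len(labels)
    return [(1.0 - epsilon) * p + epsilon / k for p in labels]

def cross_entropy(target, predicted):
    """Cross-entropy H(target, predicted) between two probability vectors."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)

# Toy example: three hypothetical teacher networks scoring one image
# over three expression classes (the real label set is larger).
teacher_logits = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5], [2.5, 0.0, -1.5]]
soft = smooth(ensemble_soft_labels(teacher_logits), epsilon=0.1)

# The student is trained to match the smoothed soft labels.
student_probs = softmax([1.0, 0.2, -0.3])
loss = cross_entropy(soft, student_probs)
print("soft labels:", soft)
print("student loss:", loss)
```

In this sketch the student never sees the hard label directly; it only sees the teacher distribution, which keeps information about how similar the classes look to the ensemble. Whether the paper combines this with a hard-label term is not stated in the text above.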
Taxonomy
Topics: Emotion and Mood Recognition · Face and Expression Recognition · Hand Gesture Recognition Systems
