Facial Expression Recognition Using Residual Masking Network
Luan Pham, The Huynh Vu, Tuan Anh Tran

TL;DR
This paper introduces a Residual Masking Network that combines residual and U-Net architectures with a segmentation-based attention mechanism to improve facial expression recognition accuracy, achieving state-of-the-art results on benchmark datasets.
Contribution
It presents a Residual Masking Network that improves CNN performance on FER by integrating segmentation-based feature refinement, an approach new to deep facial expression recognition.
Findings
Achieved state-of-the-art accuracy on the FER2013 dataset.
Outperformed existing methods on the private VEMO dataset.
Demonstrated the effectiveness of segmentation-based attention in FER.
Abstract
Automatic facial expression recognition (FER) has gained much attention due to its applications in human-computer interaction. Among the approaches to improving FER, this paper focuses on deep architectures with an attention mechanism. We propose a novel Masking idea to boost the performance of CNNs on the facial expression recognition task. It uses a segmentation network to refine feature maps, enabling the network to focus on relevant information to make correct decisions. In experiments, we combine the ubiquitous Deep Residual Network and a U-Net-like architecture to produce a Residual Masking Network. The proposed method achieves state-of-the-art (SOTA) accuracy on the well-known FER2013 dataset and the private VEMO dataset. The source code is available at https://github.com/phamquiluan/ResidualMaskingNetwork.
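The masking idea in the abstract can be illustrated with a toy sketch. It uses plain Python lists so it runs without any deep learning library, and it is not the authors' implementation. Two things here are assumptions rather than statements from the abstract: the hypothetical `toy_mask` function, which stands in for the learned U-Net-like segmentation branch with fixed pooling and upsampling, and the combination rule F * (1 + M), a common residual-attention form in which a mask M in (0, 1) rescales the feature map F without suppressing it entirely.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def avg_pool_2x2(fmap):
    """Downsample a 2D map (even height and width) by averaging 2x2 blocks."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            (fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
            for j in range(0, w, 2)
        ]
        for i in range(0, h, 2)
    ]


def upsample_2x(fmap):
    """Nearest-neighbour upsampling by a factor of 2."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def toy_mask(fmap):
    """Hypothetical stand-in for the segmentation (U-Net-like) branch.

    Encoder: 2x2 average pooling. Decoder: 2x upsampling. A skip connection
    adds the input back before a sigmoid squashes scores into (0, 1).
    A real masking branch would use learned convolutions instead.
    """
    decoded = upsample_2x(avg_pool_2x2(fmap))
    return [
        [sigmoid(d + x) for d, x in zip(drow, xrow)]
        for drow, xrow in zip(decoded, fmap)
    ]


def residual_masking(fmap):
    """Refine features as F * (1 + M); this combination rule is an assumption."""
    mask = toy_mask(fmap)
    return [
        [x * (1.0 + m) for x, m in zip(xrow, mrow)]
        for xrow, mrow in zip(fmap, mask)
    ]


# A 4x4 single-channel feature map with made-up values.
features = [
    [0.0, 0.0, 2.0, 2.0],
    [0.0, 0.0, 2.0, 2.0],
    [-1.0, -1.0, 0.5, 0.5],
    [-1.0, -1.0, 0.5, 0.5],
]
refined = residual_masking(features)
```

Because the mask lies strictly between 0 and 1, each refined value keeps the sign of the original feature and has a magnitude between one and two times the original, so strongly scored locations are amplified while the rest pass through nearly unchanged.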
Taxonomy
Topics: Emotion and Mood Recognition · Face recognition and analysis · Face and Expression Recognition
