Facial Expression Recognition Using Residual Masking Network

Luan Pham; The Huynh Vu; Tuan Anh Tran

arXiv:2603.05937·cs.CV·March 9, 2026

Open Access

TL;DR

This paper introduces a Residual Masking Network that combines residual and U-Net architectures with a segmentation-based attention mechanism to improve facial expression recognition accuracy, achieving state-of-the-art results on benchmark datasets.

Contribution

The paper presents a Residual Masking Network that improves CNN performance on FER by refining feature maps with a segmentation network, an approach the authors describe as new in deep facial expression recognition.

Findings

01. Achieved state-of-the-art accuracy on the FER2013 dataset.

02. Outperformed existing methods on the private VEMO dataset.

03. Demonstrated the effectiveness of segmentation-based attention in FER.

Abstract

Automatic facial expression recognition (FER) has gained much attention due to its applications in human-computer interaction. Among the approaches to improving FER, this paper focuses on deep architectures with an attention mechanism. We propose a novel masking idea to boost the performance of CNNs on the facial expression recognition task. It uses a segmentation network to refine feature maps, enabling the network to focus on relevant information when making decisions. In experiments, we combine the ubiquitous Deep Residual Network with a U-Net-like architecture to produce a Residual Masking Network. The proposed method achieves state-of-the-art (SOTA) accuracy on the well-known FER2013 dataset and the private VEMO dataset. The source code is available at https://github.com/phamquiluan/ResidualMaskingNetwork.
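
The refinement step described in the abstract can be illustrated with a small sketch. The abstract does not state the exact formula, so this assumes the common residual-attention form, in which each activation is scaled by (1 + mask) and the mask is a sigmoid output in (0, 1). The names `refine_feature_map` and `mask_logits` are hypothetical, and the hard-coded logits stand in for the output of the segmentation (U-Net-like) branch, which is not modeled here.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def refine_feature_map(features, mask_logits):
    """Scale each activation by (1 + mask), where mask = sigmoid(logit).

    Assumption: residual-style masking. A mask near 0 leaves the feature
    almost unchanged, a mask near 1 roughly doubles it, so the mask can
    emphasize regions without zeroing out the original features.
    """
    if len(features) != len(mask_logits):
        raise ValueError("features and mask_logits must have the same shape")
    refined = []
    for feature_row, logit_row in zip(features, mask_logits):
        if len(feature_row) != len(logit_row):
            raise ValueError("features and mask_logits must have the same shape")
        refined.append(
            [f * (1.0 + sigmoid(m)) for f, m in zip(feature_row, logit_row)]
        )
    return refined


# A 2x2 feature map and hypothetical mask logits standing in for the
# output of a segmentation branch.
features = [[1.0, 2.0], [3.0, 4.0]]
mask_logits = [[0.0, 10.0], [-10.0, 0.0]]

refined = refine_feature_map(features, mask_logits)
for row in refined:
    print([round(value, 3) for value in row])
```

In this form, a strongly negative logit leaves an activation close to its original value, while a strongly positive logit nearly doubles it. The released source code linked in the abstract is the authoritative reference for how the authors implement the masking.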

Taxonomy

Topics: Emotion and Mood Recognition · Face recognition and analysis · Face and Expression Recognition