TL;DR
This paper introduces CMNet, a cross-modal neural network that uses facial symmetry and structural cues to improve facial expression recognition, with a salient information refinement module and a half-face alignment mechanism.
Contribution
The paper presents a new cross-modal network architecture that effectively combines facial structural information and symmetry, outperforming existing methods in expression recognition.
Findings
CMNet outperforms several recent methods, including SCN and LAENet-SA, on facial expression recognition.
The salient facial information refinement module improves the stability of the resulting expression classifier.
Half-face alignment enhances the robustness of expression feature extraction.
Abstract
Deep neural networks enriched with structural information have been widely employed for facial expression recognition. However, these methods often depend on hierarchical information rather than on properties of the face itself. In this paper, we propose a cross-modal network with strong biological and structural information for facial expression recognition (CMNet). CMNet learns expression information separately from the whole face and from the left and right half faces, exploiting facial symmetry to extract complementary features. To prevent negative effects from fusing biological and structural information, a salient facial information refinement module extracts salient expression information, which improves the stability of the resulting expression classifier. To reduce reliance on unilateral facial features, a half-face alignment optimization mechanism is designed to align…
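The abstract describes three inputs derived from one face: the whole face, the left half, and the right half. The sketch below only illustrates that split using NumPy; it is not the authors' code. The function names `split_face` and `symmetry_gap` are hypothetical, the midline is assumed to be the image center (a real pipeline would first align the face with landmarks), and the pixel-level gap is just a simple way to see how far two halves differ, not the alignment objective from the paper.

```python
import numpy as np


def split_face(face: np.ndarray):
    """Return (whole, left_half, mirrored_right_half) for an H x W [x C] image.

    Assumption: the face is already centered, so the vertical midline of the
    image approximates the facial symmetry axis.
    """
    width = face.shape[1]
    mid = width // 2
    left = face[:, :mid]
    # Take the last `mid` columns so both halves have equal width,
    # dropping the center column when the width is odd.
    right = face[:, width - mid:]
    # Flip horizontally so the right half is laid out like the left half.
    right_mirrored = right[:, ::-1]
    return face, left, right_mirrored


def symmetry_gap(left: np.ndarray, right_mirrored: np.ndarray) -> float:
    """Mean absolute pixel difference between the two half faces (illustrative only)."""
    diff = left.astype(np.float64) - right_mirrored.astype(np.float64)
    return float(np.mean(np.abs(diff)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
    whole, left, right_mirrored = split_face(face)
    print(whole.shape, left.shape, right_mirrored.shape)
    print(symmetry_gap(left, right_mirrored))
```

Mirroring the right half means that a perfectly symmetric face yields identical half images, so any remaining difference reflects asymmetry in pose, lighting, or expression.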
