TL;DR
This paper introduces an end-to-end facial expression recognition architecture that uses a novel spatio-channel attention network and complementary context information, improving robustness to occlusions and pose variations without relying on landmark detectors.
Contribution
The work proposes a landmark-free, end-to-end FER model with a novel spatio-channel attention mechanism and a complementary context branch, enhancing robustness and performance.
Findings
Reports superior accuracy on multiple datasets, including in-the-wild and masked-face datasets.
Demonstrates robustness to occlusions and pose variations.
Code is publicly available for reproducibility.
Abstract
A recent trend in recognizing facial expressions in real-world scenarios is to deploy attention-based convolutional neural networks (CNNs) locally to signify the importance of facial regions and to combine them with global facial features and/or other complementary context information for performance gain. However, in the presence of occlusions and pose variations, different channels respond differently, and the response intensity of a channel differs across spatial locations. Also, modern facial expression recognition (FER) architectures rely on external sources such as landmark detectors to define attention. A failure of the landmark detector has a cascading effect on FER. Additionally, no emphasis is placed on the relevance of the features that are input to compute complementary context information. Leveraging the aforementioned observations, an end-to-end architecture for…
Taxonomy
Methods: Global Average Pooling · Convolution · Sigmoid Activation · 1x1 Convolution · Average Pooling · Residual Connection · Efficient Channel Attention
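Several of the listed methods compose naturally into a channel gate. The following is a minimal sketch of an Efficient Channel Attention style gate in plain Python: global average pooling per channel, a 1D convolution across the channel axis, and a sigmoid. It is an illustration of the general technique only, not the paper's spatio-channel attention network. The function name `eca_gate`, the nested-list feature layout, and the fixed kernel weights are assumptions made for this example; in a trained model the kernel would be learned.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def eca_gate(features, kernel):
    """Hypothetical helper: reweight channels with an ECA-style gate.

    features: list of C channels, each an H x W list of lists of floats.
    kernel:   odd-length list of 1D convolution weights applied across
              the channel axis (fixed here; learned in a real model).
    Returns (weights, scaled), where weights has one value per channel.
    """
    num_channels = len(features)
    k = len(kernel)
    if k % 2 != 1:
        raise ValueError("kernel length must be odd")
    pad = k // 2

    # Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]

    # 1D convolution over neighbouring channels, with zero padding so the
    # output has the same number of channels as the input.
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(kernel[j] * padded[i + j] for j in range(k))
        for i in range(num_channels)
    ]

    # Sigmoid turns each mixed descriptor into a gate in (0, 1).
    weights = [sigmoid(m) for m in mixed]

    # Scale every spatial location of a channel by that channel's gate.
    scaled = [
        [[w * value for value in row] for row in channel]
        for channel, w in zip(features, weights)
    ]
    return weights, scaled


# Three 2x2 channels with mean activations 1, 0, and -1.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
# An identity kernel passes each pooled value straight to the sigmoid.
weights, scaled = eca_gate(features, [0.0, 1.0, 0.0])
```

A gate of this kind assigns one weight per channel, so it cannot by itself capture variation across spatial locations within a channel, which is the limitation the abstract points to.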
