TL;DR
EmoNeXt is a novel deep learning framework that enhances facial emotion recognition by integrating ConvNeXt with spatial attention, channel dependencies, and self-attention regularization, achieving superior accuracy on FER2013.
Contribution
The paper introduces EmoNeXt, an adapted ConvNeXt architecture with novel modules for improved facial emotion recognition performance.
Findings
Outperforms existing models on FER2013 dataset
Achieves higher emotion classification accuracy
Effectively captures facial features with integrated modules
Abstract
Facial expressions play a crucial role in human communication serving as a powerful and impactful means to express a wide range of emotions. With advancements in artificial intelligence and computer vision, deep neural networks have emerged as effective tools for facial emotion recognition. In this paper, we propose EmoNeXt, a novel deep learning framework for facial expression recognition based on an adapted ConvNeXt architecture network. We integrate a Spatial Transformer Network (STN) to focus on feature-rich regions of the face and Squeeze-and-Excitation blocks to capture channel-wise dependencies. Moreover, we introduce a self-attention regularization term, encouraging the model to generate compact feature vectors. We demonstrate the superiority of our model over existing state-of-the-art deep learning models on the FER2013 dataset regarding emotion classification accuracy.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
MethodsAttention Is All You Need · Absolute Position Encodings · Adam · Residual Connection · Dropout · Softmax · Byte Pair Encoding · Linear Layer · ConvNeXt · Multi-Head Attention
