TL;DR
This paper introduces a manifold network-based covariance pooling method within deep learning models to enhance facial expression recognition, achieving state-of-the-art accuracy on multiple datasets and effectively capturing regional facial distortions.
Contribution
It proposes a novel manifold network structure for covariance pooling integrated with convolutional networks, improving both spatial and temporal facial expression recognition.
Findings
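Achieved 58.14% accuracy on the validation set of Static Facial Expressions in the Wild (SFEW 2.0).
Achieved 87.0% accuracy on the validation set of the Real-World Affective Faces (RAF) Database.
Both results were the best the authors were aware of at the time of writing.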
Abstract
Classifying facial expressions into different categories requires capturing regional distortions of facial landmarks. We believe that second-order statistics such as covariance is better able to capture such distortions in regional facial features. In this work, we explore the benefits of using a manifold network structure for covariance pooling to improve facial expression recognition. In particular, we first employ such kind of manifold networks in conjunction with traditional convolutional networks for spatial pooling within individual image feature maps in an end-to-end deep learning manner. By doing so, we are able to achieve a recognition accuracy of 58.14% on the validation set of Static Facial Expressions in the Wild (SFEW 2.0) and 87.0% on the validation set of Real-World Affective Faces (RAF) Database. Both of these results are the best results we are aware of.…
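To make the idea of covariance pooling concrete, here is a minimal NumPy sketch of second-order pooling over a single convolutional feature map. It is an illustration under stated assumptions, not the authors' implementation: the function names `covariance_pool` and `log_eig`, the regularizer `eps`, and the use of a matrix logarithm to flatten the symmetric positive-definite (SPD) result are choices made here for the example, since the abstract does not specify the layers of the manifold network.

```python
import numpy as np

def covariance_pool(feature_map, eps=1e-4):
    """Pool a (channels, height, width) feature map into a channels x channels covariance matrix.

    Each spatial location is treated as one sample of a channels-dimensional feature.
    eps is an assumed regularizer that keeps the result strictly positive definite.
    """
    c, h, w = feature_map.shape
    x = feature_map.reshape(c, h * w)               # columns are spatial samples
    x_centered = x - x.mean(axis=1, keepdims=True)  # remove the per-channel mean
    cov = x_centered @ x_centered.T / (h * w - 1)   # unbiased sample covariance
    return cov + eps * np.eye(c)

def log_eig(spd, eps=1e-4):
    """Matrix logarithm of an SPD matrix via eigendecomposition (assumed flattening step)."""
    vals, vecs = np.linalg.eigh(spd)
    vals = np.maximum(vals, eps)                    # guard against round-off before the log
    return (vecs * np.log(vals)) @ vecs.T

# Hypothetical feature map: 8 channels on a 6 x 6 spatial grid.
rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 6, 6))

cov = covariance_pool(fmap)
desc = log_eig(cov)
# The matrix is symmetric, so the upper triangle carries all the information.
features = desc[np.triu_indices(desc.shape[0])]
```

The resulting `features` vector could then be passed to an ordinary classifier. In contrast to average or max pooling, which keep only first-order statistics per channel, the covariance matrix records how channels co-vary across spatial locations.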
