Rethinking Occlusion in FER: A Semantic-Aware Perspective and Go Beyond
Huiyu Zhai, Xingxing Yang, Yalan Ye, Chenyang Li, Bin Fan, Changze Li

TL;DR
This paper introduces ORSANet, a facial expression recognition (FER) model that uses semantic segmentation maps, facial landmarks, and a dynamic loss function to improve accuracy under occlusion. It also presents Occlu-FER, a new occlusion-focused dataset.
Contribution
The paper proposes multi-modal semantic guidance for FER, combined through a new fusion module and trained with a dynamic loss. It also contributes what the authors describe as the first occlusion-oriented FER dataset, aimed at improving robustness in occlusion scenarios.
Findings
ORSANet achieves state-of-the-art performance on public benchmarks.
The model demonstrates robustness on the newly constructed Occlu-FER dataset.
Multi-modal semantic guidance improves facial expression recognition under occlusion.
Abstract
Facial expression recognition (FER) is a challenging task due to pervasive occlusion and dataset biases. Especially when facial information is partially occluded, existing FER models struggle to extract effective facial features, leading to inaccurate classifications. In response, we present ORSANet, which introduces the following three key contributions: First, we introduce auxiliary multi-modal semantic guidance to disambiguate facial occlusion and learn high-level semantic knowledge, which is two-fold: 1) we introduce semantic segmentation maps as dense semantics prior to generate semantics-enhanced facial representations; 2) we introduce facial landmarks as sparse geometric prior to mitigate intrinsic noises in FER, such as identity and gender biases. Second, to facilitate the effective incorporation of these two multi-modal priors, we customize a Multi-scale Cross-interaction…
Taxonomy
Topics: Emotion and Mood Recognition · Face recognition and analysis · Domain Adaptation and Few-Shot Learning
