Rethinking Occlusion in FER: A Semantic-Aware Perspective and Go Beyond

Huiyu Zhai; Xingxing Yang; Yalan Ye; Chenyang Li; Bin Fan; Changze Li

arXiv:2507.15401·cs.CV·July 25, 2025

Open Access

TL;DR

This paper introduces ORSANet, a novel facial expression recognition model that leverages semantic segmentation, facial landmarks, and a dynamic loss function to improve accuracy under occlusion, supported by a new occlusion-focused dataset.

Contribution

The paper proposes a multi-modal semantic guidance approach with a new fusion module and a dynamic loss, along with the first occlusion-oriented FER dataset, advancing robustness in occlusion scenarios.

Findings

01

ORSANet achieves state-of-the-art performance on public benchmarks.

02

The model demonstrates robustness on the newly constructed Occlu-FER dataset.

03

Multi-modal semantic guidance improves facial expression recognition under occlusion.

Abstract

Facial expression recognition (FER) is a challenging task due to pervasive occlusion and dataset biases. When facial information is partially occluded, existing FER models struggle to extract effective facial features, leading to inaccurate classifications. In response, we present ORSANet, which makes the following three key contributions. First, we introduce auxiliary multi-modal semantic guidance to disambiguate facial occlusion and learn high-level semantic knowledge. This guidance is two-fold: 1) we introduce semantic segmentation maps as a dense semantic prior to generate semantics-enhanced facial representations; 2) we introduce facial landmarks as a sparse geometric prior to mitigate intrinsic noise in FER, such as identity and gender biases. Second, to facilitate the effective incorporation of these two multi-modal priors, we customize a Multi-scale Cross-interaction…
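To make the two priors described in the abstract concrete, the sketch below shows one generic way a segmentation map and a set of landmarks could be turned into dense and sparse guidance channels. It is an illustration only, not ORSANet's method: the abstract is truncated before the fusion module is described, so this example simply concatenates channels. The function names, the Gaussian `sigma`, and the channel layout are all assumptions made for this example.

```python
import numpy as np


def one_hot_segmentation(label_map, num_classes):
    """Dense prior: (H, W) integer labels -> (num_classes, H, W) one-hot planes."""
    planes = np.eye(num_classes, dtype=np.float32)[label_map]  # (H, W, C)
    return np.transpose(planes, (2, 0, 1))


def landmark_heatmaps(landmarks, height, width, sigma=2.0):
    """Sparse prior: (K, 2) array of (x, y) pixel coords -> (K, H, W) Gaussian heatmaps."""
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float32)
    maps = []
    for x, y in landmarks:
        dist_sq = (xs - x) ** 2 + (ys - y) ** 2
        maps.append(np.exp(-dist_sq / (2.0 * sigma ** 2)))
    return np.stack(maps).astype(np.float32)


def build_guided_input(image, label_map, landmarks, num_classes):
    """Stack image, dense, and sparse channels along the channel axis.

    Plain concatenation is a stand-in chosen for this sketch; it is NOT
    the fusion module the paper proposes.
    """
    _, height, width = image.shape
    dense = one_hot_segmentation(label_map, num_classes)
    sparse = landmark_heatmaps(landmarks, height, width)
    return np.concatenate([image, dense, sparse], axis=0)


if __name__ == "__main__":
    # Toy inputs: an 8x8 RGB image, 3 hypothetical semantic classes, 2 landmarks.
    image = np.zeros((3, 8, 8), dtype=np.float32)
    label_map = np.zeros((8, 8), dtype=np.int64)
    label_map[2:5, 2:5] = 1
    landmarks = np.array([[2.0, 3.0], [5.0, 5.0]])
    guided = build_guided_input(image, label_map, landmarks, num_classes=3)
    print(guided.shape)
```

The dense channels carry a label for every pixel, while each landmark heatmap is near zero everywhere except around one point, which is the dense versus sparse contrast the abstract draws.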

Taxonomy

Topics: Emotion and Mood Recognition · Face recognition and analysis · Domain Adaptation and Few-Shot Learning