NR-DFERNet: Noise-Robust Network for Dynamic Facial Expression Recognition
Hanting Li, Mingzhe Sui, Zhaoqing Zhu, and Feng Zhao

TL;DR
This paper introduces NR-DFERNet, a noise-robust network for dynamic facial expression recognition in the wild, effectively filtering noisy frames and improving accuracy on challenging benchmarks.
Contribution
The paper proposes a novel noise-robust DFER network with a dynamic-static fusion module, a dynamic class token, and a snippet-based filter, addressing noise interference in wild videos.
Findings
Outperforms state-of-the-art methods on the DFEW benchmark
Reduces the interference of noisy frames in video sequences
Improved accuracy in wild facial expression recognition
Abstract
Dynamic facial expression recognition (DFER) in the wild is an extremely challenging task, due to a large number of noisy frames in the video sequences. Previous works focus on extracting more discriminative features, but ignore distinguishing the key frames from the noisy frames. To tackle this problem, we propose a noise-robust dynamic facial expression recognition network (NR-DFERNet), which can effectively reduce the interference of noisy frames on the DFER task. Specifically, at the spatial stage, we devise a dynamic-static fusion module (DSF) that introduces dynamic features to static features for learning more discriminative spatial features. To suppress the impact of target irrelevant frames, we introduce a novel dynamic class token (DCT) for the transformer at the temporal stage. Moreover, we design a snippet-based filter (SF) at the decision stage to reduce the effect of too…
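The abstract says only that the DSF module "introduces dynamic features to static features" and gives no formula. As a rough illustration of that idea, the sketch below treats the difference between consecutive per-frame feature vectors as the dynamic feature and adds it to the static feature. The frame-differencing step, the additive fusion, the `weight` parameter, and the function name `fuse_dynamic_static` are all assumptions made for this example and are not taken from the paper.

```python
from typing import List

Vector = List[float]


def fuse_dynamic_static(static_feats: List[Vector], weight: float = 1.0) -> List[Vector]:
    """Illustrative fusion of per-frame static features with a simple dynamic cue.

    Assumption (not from the paper): the dynamic feature of frame t is the
    difference between its static feature and that of frame t-1, and fusion
    is a weighted element-wise addition.
    """
    if not static_feats:
        return []

    dim = len(static_feats[0])
    fused: List[Vector] = []
    prev = static_feats[0]  # the first frame has no predecessor, so its dynamic cue is zero

    for cur in static_feats:
        if len(cur) != dim:
            raise ValueError("all frame features must have the same length")
        # Dynamic cue: change relative to the previous frame.
        dynamic = [c - p for c, p in zip(cur, prev)]
        # Fuse: static feature plus weighted dynamic cue.
        fused.append([c + weight * d for c, d in zip(cur, dynamic)])
        prev = cur

    return fused


if __name__ == "__main__":
    # Three frames with 2-dimensional toy features.
    frames = [[1.0, 2.0], [2.0, 2.0], [2.0, 5.0]]
    print(fuse_dynamic_static(frames))
```

In this toy version, frames that barely change from their predecessor keep almost exactly their static feature, while frames with motion are shifted in the direction of the change. A real model would operate on learned feature maps rather than raw lists.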
Taxonomy
Topics: Emotion and Mood Recognition · Advanced Computing and Algorithms · Gaze Tracking and Assistive Technology
