Your "Attention" Deserves Attention: A Self-Diversified Multi-Channel Attention for Facial Action Analysis
Xiaotian Li, Zhihua Li, Huiyuan Yang, Geran Zhao, Lijun Yin

TL;DR
This paper introduces SMA-Net, a self-diversified multi-channel attention model that improves the robustness and discriminative power of attention maps for facial action analysis, achieving state-of-the-art results on AU detection and facial expression recognition benchmarks.
Contribution
The paper proposes a new attention mechanism that enhances attention map quality and models inter-attention relationships for better facial action recognition.
Findings
Achieves superior performance on benchmark AU detection datasets.
Outperforms state-of-the-art methods in facial expression recognition.
Enhances the robustness and diversity of attention maps.
Abstract
Visual attention has been extensively studied for learning fine-grained features in both facial expression recognition (FER) and Action Unit (AU) detection. A broad range of previous research has explored how to use attention modules to localize detailed facial parts (e.g., facial action units), learn discriminative features, and learn inter-class correlation. However, few related works pay attention to the robustness of the attention module itself. Through experiments, we found that neural attention maps initialized with different feature maps yield diverse representations when learning to attend to the same Region of Interest (ROI). In other words, as in general feature learning, the representational quality of attention maps greatly affects the performance of a model, which means unconstrained attention learning involves considerable randomness. This uncertainty lets conventional…
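The abstract's observation that differently initialized attention maps can diverge on the same input can be illustrated with a toy sketch. This is not SMA-Net: the `spatial_attention` and `cosine_similarity` helpers, the random "backbone" features, and the linear scoring weights are all assumptions made for illustration. The sketch only shows how one might compare two attention maps computed over identical features.

```python
import numpy as np

def spatial_attention(features, weights):
    """Collapse a (C, H, W) feature map into an (H, W) softmax attention map.

    `weights` is a hypothetical per-channel scoring vector of shape (C,).
    """
    scores = np.tensordot(weights, features, axes=([0], [0]))  # shape (H, W)
    scores = scores - scores.max()  # subtract max for numerical stability
    exp = np.exp(scores)
    return exp / exp.sum()  # normalize over all spatial positions

def cosine_similarity(a, b):
    """Cosine similarity between two attention maps, flattened to vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16, 16))  # stand-in for backbone features

# Two attention channels over the same features, differing only in their
# (random) initialization.
w1 = rng.standard_normal(8)
w2 = rng.standard_normal(8)
att1 = spatial_attention(features, w1)
att2 = spatial_attention(features, w2)

similarity = cosine_similarity(att1, att2)
print(f"cosine similarity between the two attention maps: {similarity:.3f}")
```

A similarity well below 1 indicates that the two channels attend to different locations despite seeing the same input, which is the kind of variability the abstract describes.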
