Your "Attention" Deserves Attention: A Self-Diversified Multi-Channel Attention for Facial Action Analysis

Xiaotian Li; Zhihua Li; Huiyuan Yang; Geran Zhao; Lijun Yin

arXiv:2203.12570·cs.CV·March 24, 2022

TL;DR

This paper introduces SMA-Net, a novel self-diversified multi-channel attention model that improves the robustness and discriminative power of attention maps for facial action analysis, achieving state-of-the-art results.

Contribution

The paper proposes a new attention mechanism that enhances attention map quality and models inter-attention relationships for better facial action recognition.

Findings

01. Achieves superior performance on benchmark AU detection datasets.

02. Outperforms state-of-the-art methods in facial expression recognition.

03. Enhances the robustness and diversity of attention maps.

Abstract

Visual attention has been extensively studied for learning fine-grained features in both facial expression recognition (FER) and Action Unit (AU) detection. A broad range of previous research has explored how to use attention modules to localize detailed facial parts (e.g., facial action units), learn discriminative features, and learn inter-class correlations. However, few related works pay attention to the robustness of the attention module itself. Through experiments, we found that neural attention maps initialized with different feature maps yield diverse representations when learning to attend to the identical Region of Interest (ROI). In other words, similar to general feature learning, the representational quality of attention maps also greatly affects the performance of a model, which means unconstrained attention learning involves considerable randomness. This uncertainty lets conventional…
