TransFER: Learning Relation-aware Facial Expression Representations with Transformers

Fanglei Xue; Qiangchang Wang; Guodong Guo

arXiv:2108.11116·cs.CV·August 26, 2021·1 cites

TL;DR

TransFER introduces a relation-aware facial expression recognition model using transformers, employing novel attention dropping techniques to enhance local patch diversity and relation learning, leading to superior performance on FER benchmarks.

Contribution

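The paper proposes TransFER, a transformer-based FER model that uses two attention-dropping methods, Multi-Attention Dropping (MAD) and Multi-head Self-Attention Dropping (MSAD), to improve local patch diversity and relation modeling, and reports results that outperform existing FER methods on multiple benchmarks.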

Findings

01 Outperforms existing FER methods on multiple benchmarks.

02 Effectively learns diverse local patches despite pose/viewpoint variations.

03 Enhances relation modeling among facial patches using transformers.

Abstract

Facial expression recognition (FER) has received increasing interest in computer vision. We propose the TransFER model, which can learn rich relation-aware local representations. It mainly consists of three components: Multi-Attention Dropping (MAD), ViT-FER, and Multi-head Self-Attention Dropping (MSAD). First, local patches play an important role in distinguishing various expressions; however, few existing works can locate discriminative and diverse local patches. This can cause serious problems when some patches are invisible due to pose variations or viewpoint changes. To address this issue, MAD is proposed to randomly drop an attention map. Consequently, models are pushed to explore diverse local patches adaptively. Second, to build rich relations between different local patches, Vision Transformers (ViT) are used in FER, called ViT-FER. Since the global scope is used to…
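
The idea behind MAD, randomly dropping an attention map, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `drop_one_attention_map`, the `drop_prob` parameter, the uniform choice of which map to drop, and the decision to "drop" a map by setting it to zero are all assumptions made for illustration; the abstract only states that an attention map is randomly dropped.

```python
import random


def drop_one_attention_map(attention_maps, drop_prob=0.5, rng=None):
    """Return a copy of attention_maps with at most one map zeroed out.

    Hypothetical helper: attention_maps is a list of 2D lists of floats.
    With probability drop_prob, one map is picked uniformly at random and
    replaced by zeros (an assumed reading of "dropping" a map).
    """
    rng = rng or random.Random()
    # Copy every map so the caller's data is left untouched.
    result = [[row[:] for row in amap] for amap in attention_maps]
    if not result or rng.random() >= drop_prob:
        return result  # nothing is dropped on this call
    # Pick one map uniformly at random and zero it out.
    idx = rng.randrange(len(result))
    result[idx] = [[0.0] * len(row) for row in result[idx]]
    return result


# Example: three toy 1x2 attention maps, always dropping one of them.
maps = [[[0.2, 0.8]], [[0.5, 0.5]], [[0.9, 0.1]]]
dropped = drop_one_attention_map(maps, drop_prob=1.0, rng=random.Random(0))
```

Because a different map may be suppressed on each training step, a model that consumes these maps cannot rely on any single one, which matches the abstract's stated goal of pushing the model to explore diverse local patches. At inference time one would presumably skip the drop, for instance by passing `drop_prob=0.0`.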

Taxonomy

Topics: Emotion and Mood Recognition · Advanced Computing and Algorithms · Gaze Tracking and Assistive Technology