Affective Behaviour Analysis Using Pretrained Model with Facial Priori

Yifan Li; Haomiao Sun; Zhaori Liu; Hu Han

arXiv:2207.11679 · cs.CV · September 20, 2022

TL;DR

This paper presents an affective behaviour analysis method that exploits facial priors from pretrained models: a Vision Transformer pretrained with a Masked Auto-Encoder on unlabeled face images is combined with an AffectNet-pretrained CNN for multi-task emotion recognition, with the aim of reducing reliance on costly annotations.

Contribution

It proposes a multi-task framework using pretrained models and a co-training strategy with independent views for more effective emotion analysis.

Findings

01. Effective on ABAW4 dataset

02. Improved valence-arousal regression accuracy

03. Demonstrated robustness of the co-training approach

Abstract

Affective behaviour analysis has attracted researchers' attention due to its broad applications. However, obtaining accurate annotations for massive numbers of face images is labor-intensive. Thus, we propose to utilize prior facial information via a Masked Auto-Encoder (MAE) pretrained on unlabeled face images. Furthermore, we combine an MAE-pretrained Vision Transformer (ViT) and an AffectNet-pretrained CNN to perform multi-task emotion recognition. We notice that expression and action unit (AU) scores are pure and intact features for valence-arousal (VA) regression. As a result, we use the AffectNet-pretrained CNN to extract expression scores, which are concatenated with the expression and AU scores from the ViT to obtain the final VA features. Moreover, we also propose a co-training framework with two parallel MAE-pretrained ViTs for expression recognition tasks. In order to make the two views independent, we…
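
The VA feature construction described in the abstract (CNN expression scores concatenated with ViT expression and AU scores) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the class counts, the use of softmax and sigmoid to turn logits into scores, the `tanh` output range, and the names `build_va_features` and `va_head` are all assumptions made for illustration, and random numbers stand in for real network outputs.

```python
import math
import random

# Assumed sizes, for illustration only (not taken from the paper).
NUM_EXPR_CNN = 8   # expression classes scored by the AffectNet-pretrained CNN
NUM_EXPR_VIT = 8   # expression classes scored by the MAE-pretrained ViT
NUM_AU = 12        # action units scored by the ViT


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def build_va_features(cnn_expr_logits, vit_expr_logits, vit_au_logits):
    """Concatenate CNN expression scores with ViT expression and AU scores.

    Assumption: "scores" are softmax probabilities for expressions and
    per-unit sigmoid activations for AUs.
    """
    return (
        softmax(cnn_expr_logits)
        + softmax(vit_expr_logits)
        + [sigmoid(x) for x in vit_au_logits]
    )


def va_head(features, weights, bias):
    """Hypothetical linear head; tanh keeps valence and arousal in [-1, 1]."""
    return [
        math.tanh(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(weights, bias)
    ]


# Stand-ins for network outputs on one face image.
rng = random.Random(0)
cnn_expr = [rng.gauss(0, 1) for _ in range(NUM_EXPR_CNN)]
vit_expr = [rng.gauss(0, 1) for _ in range(NUM_EXPR_VIT)]
vit_au = [rng.gauss(0, 1) for _ in range(NUM_AU)]

features = build_va_features(cnn_expr, vit_expr, vit_au)
weights = [[rng.gauss(0, 0.1) for _ in features] for _ in range(2)]
valence, arousal = va_head(features, weights, [0.0, 0.0])
```

Under these assumptions the regression input is a compact 28-dimensional vector of interpretable scores rather than a high-dimensional backbone embedding, which is one way to read the abstract's remark that expression and AU scores are "pure and intact" features for VA regression.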

Code & Models

Repositories

jackyfl/emma_cotex_abaw4
PyTorch · Official

Taxonomy

Topics: Emotion and Mood Recognition

Methods: Attention Is All You Need · Masked autoencoder · Linear Layer · Position-Wise Feed-Forward Layer · Softmax · Byte Pair Encoding · Vision Transformer · Adam · Label Smoothing · Dense Connections