Multi-label Transformer for Action Unit Detection

Gauthier Tallec; Edouard Yvinec; Arnaud Dapogny; Kevin Bailly

arXiv:2203.12531 · cs.CV · December 13, 2022 · 6 cites

TL;DR

This paper presents a multi-label detection transformer for facial Action Unit (AU) detection. It uses multi-head attention to learn which parts of the face image are most relevant to each AU, and it is trained on the 2M-frame AU-annotated dataset of the ABAW challenge.

Contribution

It presents a novel transformer-based approach for AU detection that effectively identifies relevant facial regions for each action unit.

Findings

01. Improved AU detection accuracy on the ABAW dataset
02. Effective use of multi-head attention for facial region relevance
03. Successful submission to the ABAW3 challenge

Abstract

Action Unit (AU) detection is the branch of affective computing that aims at recognizing unitary facial muscular movements. It is key to unlocking unbiased computational face representations and has therefore aroused great interest in the past few years. One of the main obstacles to building efficient deep-learning-based AU detection systems is the lack of large facial image databases annotated by AU experts. To that extent, the ABAW challenge paves the way toward better AU detection, as it involves a dataset of 2M AU-annotated frames. In this paper, we present our submission to the ABAW3 challenge. In a nutshell, we applied a multi-label detection transformer that leverages multi-head attention to learn which part of the face image is the most relevant to predict each AU.
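
The core idea in the abstract, one query per AU attending over face-image patches through multi-head attention, can be illustrated with a minimal sketch. This is not the authors' implementation: the patch count, embedding size, number of AUs, number of heads, random weights, and all function names below are assumptions chosen for illustration, and the usual output projection and training loop are omitted. It uses only the Python standard library so that it runs as written.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Random Gaussian matrix standing in for learned parameters."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def au_cross_attention(patches, au_queries, Wq, Wk, Wv, n_heads):
    """Let each AU query attend over the patch tokens.

    Returns (outputs, attention), where attention[a][h][p] is the weight
    that head h of AU a places on patch p.
    """
    dim = len(Wq)
    d_head = dim // n_heads
    keys = [matvec(Wk, p) for p in patches]
    values = [matvec(Wv, p) for p in patches]
    outputs, attention = [], []
    for query in au_queries:
        q = matvec(Wq, query)
        out, per_head = [], []
        for h in range(n_heads):
            lo, hi = h * d_head, (h + 1) * d_head
            # Scaled dot-product scores of this head against every patch.
            scores = [
                sum(a * b for a, b in zip(q[lo:hi], k[lo:hi])) / math.sqrt(d_head)
                for k in keys
            ]
            weights = softmax(scores)
            per_head.append(weights)
            # Weighted sum of this head's slice of the value vectors.
            out.extend(
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(lo, hi)
            )
        outputs.append(out)  # concatenation of all heads
        attention.append(per_head)
    return outputs, attention


def au_probabilities(outputs, cls_w, cls_b):
    """Multi-label head: one linear layer and one sigmoid per AU."""
    probs = []
    for out, w, b in zip(outputs, cls_w, cls_b):
        logit = sum(wi * oi for wi, oi in zip(w, out)) + b
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs


# Illustrative sizes (assumptions): a 7x7 patch grid, 12 AUs, 4 heads.
rng = random.Random(0)
N_PATCHES, N_AUS, DIM, N_HEADS = 49, 12, 16, 4
patches = rand_matrix(N_PATCHES, DIM, rng, scale=1.0)  # stand-in for backbone features
au_queries = rand_matrix(N_AUS, DIM, rng, scale=1.0)   # learned in a real model
Wq, Wk, Wv = (rand_matrix(DIM, DIM, rng) for _ in range(3))
cls_w = rand_matrix(N_AUS, DIM, rng)
cls_b = [0.0] * N_AUS

outputs, attention = au_cross_attention(patches, au_queries, Wq, Wk, Wv, N_HEADS)
probs = au_probabilities(outputs, cls_w, cls_b)

# Patch that receives the most attention (summed over heads) for the first AU.
top_patch = max(range(N_PATCHES), key=lambda p: sum(head[p] for head in attention[0]))
```

Because each AU gets its own sigmoid, the predictions are independent and several AUs can be active on the same frame, which is what makes the setup multi-label. Inspecting `attention[a]` for a given AU index `a` shows which patches that AU's query weights most heavily; with trained parameters this is how one would read off the face regions the model considers relevant.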

Taxonomy

Topics: Emotion and Mood Recognition · Gaze Tracking and Assistive Technology

Methods: Softmax · Linear Layer