Deep Structure Inference Network for Facial Action Unit Recognition
Ciprian A. Corneanu, Meysam Madadi, Sergio Escalera

TL;DR
This paper introduces a deep neural network that combines local and global features with message passing between classes to improve facial Action Unit recognition, reporting state-of-the-art results on the BP4D and DISFA benchmark datasets.
Contribution
A novel deep architecture that integrates feature learning with message passing for enhanced facial Action Unit recognition.
Findings
Improves on the previous state of the art by 5.3% on the BP4D dataset.
Improves on the previous state of the art by 8.2% on the DISFA dataset.
End-to-end training with increased supervision boosts results.
Abstract
Facial expressions are combinations of basic components called Action Units (AUs). Recognizing AUs is key for developing general facial expression analysis. In recent years, most efforts in automatic AU recognition have been dedicated to learning combinations of local features and to exploiting correlations between Action Units. In this paper, we propose a deep neural architecture that tackles both problems by combining learned local and global features in its initial stages and, in later stages, replicating a message passing algorithm between classes, similar to inference in a graphical model. We show that by training the model end-to-end with increased supervision we improve on the state of the art by 5.3% and 8.2% on the BP4D and DISFA datasets, respectively.
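The two ideas in the abstract, fusing local and global features and then passing messages between AU classes, can be illustrated with a toy sketch. This is not the authors' implementation: the function names `fuse_features` and `message_passing`, the hand-set `weights` matrix, and the fixed number of steps are all assumptions made for illustration. In the paper these components are learned inside a deep network trained end-to-end.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_features(local_feats, global_feat):
    """Concatenate per-patch local features with a global face feature.

    Hypothetical stand-in for the learned fusion in the early stages.
    """
    fused = []
    for patch in local_feats:
        fused.extend(patch)
    fused.extend(global_feat)
    return fused


def message_passing(logits, weights, steps=2):
    """Refine per-AU logits by exchanging messages between classes.

    weights[i][j] is an assumed influence of AU j on AU i
    (positive: the AUs tend to co-occur, negative: they tend to exclude
    each other). Each step adds the weighted beliefs of the other AUs
    to the initial logit of every AU.
    """
    initial = list(logits)
    current = list(logits)
    n = len(logits)
    for _ in range(steps):
        probs = [sigmoid(z) for z in current]
        current = [
            initial[i]
            + sum(weights[i][j] * probs[j] for j in range(n) if j != i)
            for i in range(n)
        ]
    return [sigmoid(z) for z in current]


# Three hypothetical AUs: AU0 is confidently present, AU1 is uncertain,
# AU2 is confidently absent. AU0 co-occurs with AU1 and excludes AU2.
weights = [
    [0.0, 1.0, -1.0],
    [1.0, 0.0, 0.0],
    [-1.0, 0.0, 0.0],
]
refined = message_passing([2.0, 0.0, -2.0], weights)
print(refined)
```

In this toy setting the uncertain AU1 is pulled toward "present" because its correlated neighbour AU0 is confident, which is the kind of effect that exploiting AU correlations is meant to capture.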
