Stacked Dense U-Nets with Dual Transformers for Robust Face Alignment
Jia Guo, Jiankang Deng, Niannan Xue, Stefanos Zafeiriou

TL;DR
This paper introduces a stacked dense U-Net architecture with dual transformers (deformable convolutions inside the network and a coherent loss for outside data transformation) for robust face landmark localization, achieving state-of-the-art results in challenging in-the-wild conditions.
Contribution
The paper proposes an architecture that combines stacked dense U-Nets, built from a scale aggregation topology and a channel aggregation building block, with deformable convolutions and a coherent loss to improve robustness in face alignment.
Findings
Robust face alignment under extreme poses and occlusions
State-of-the-art accuracy on the CFP-FP dataset
Enhanced 3D face alignment benefits pose-invariant recognition
Abstract
Facial landmark localisation in images captured in-the-wild is an important and challenging problem. The current state-of-the-art revolves around certain kinds of Deep Convolutional Neural Networks (DCNNs) such as stacked U-Nets and Hourglass networks. In this work, we propose stacked dense U-Nets for this task. We design a novel scale aggregation network topology and a channel aggregation building block to improve the model's capacity without increasing computational complexity or model size. With the assistance of deformable convolutions inside the stacked dense U-Nets and a coherent loss for outside data transformation, our model obtains the ability to be spatially invariant to arbitrary input face images. Extensive experiments on many in-the-wild datasets validate the robustness of the proposed method under extreme poses, exaggerated expressions and heavy…
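The abstract does not define the coherent loss. As a rough illustration only, the sketch below assumes it acts as a consistency penalty: the heatmap predicted for a transformed image should match the transformed heatmap of the original image. The names `coherent_loss`, `flip`, `equivariant_model`, and `biased_model` are hypothetical stand-ins chosen for this example and are not the authors' network or code.

```python
import numpy as np

def coherent_loss(model, image, transform):
    """Mean squared gap between model(transform(image)) and transform(model(image)).

    Assumed form of a consistency penalty; not taken from the paper's code.
    """
    heat_of_transformed = model(transform(image))
    transformed_heat = transform(model(image))
    return float(np.mean((heat_of_transformed - transformed_heat) ** 2))

def flip(x):
    # Horizontal flip, used here as a simple stand-in spatial transformation.
    return x[:, ::-1]

def equivariant_model(x):
    # Toy pointwise "network": commutes with the flip.
    return 2.0 * x

def biased_model(x):
    # Toy "network" that weights columns from left to right,
    # so it does not commute with the flip.
    return x * np.linspace(0.0, 1.0, x.shape[1])

image = np.arange(12, dtype=float).reshape(3, 4)
loss_equivariant = coherent_loss(equivariant_model, image, flip)
loss_biased = coherent_loss(biased_model, image, flip)
```

Under this assumed form, a model whose output moves with the input transformation incurs no penalty, while a position-dependent model is penalised, which is one way to read the abstract's claim of spatial invariance.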
Taxonomy
Topics: Face recognition and analysis · Face and Expression Recognition · Generative Adversarial Networks and Image Synthesis
