PFLD: A Practical Facial Landmark Detector
Xiaojie Guo, Siyuan Li, Jinke Yu, Jiawan Zhang, Jiayi Ma, Lin Ma, Wei Liu, and Haibin Ling

TL;DR
This paper introduces PFLD, a compact and efficient facial landmark detector that achieves high accuracy and real-time speed on mobile devices, making it suitable for practical applications in unconstrained environments.
Contribution
The paper proposes a novel end-to-end single-stage network with a new loss function and training strategies, enabling accurate, fast, and lightweight facial landmark detection on mobile devices.
Findings
Outperforms state-of-the-art methods on 300W and AFLW benchmarks.
Achieves over 140 fps on a mobile phone with high precision.
Model size is only 2.1MB, suitable for real-time applications.
Abstract
Being accurate, efficient, and compact is essential to a facial landmark detector for practical use. To simultaneously consider the three concerns, this paper investigates a neat model with promising detection accuracy under wild environments (e.g., unconstrained pose, expression, lighting, and occlusion conditions) and super real-time speed on a mobile device. More concretely, we customize an end-to-end single-stage network associated with acceleration techniques. During the training phase, for each sample, rotation information is estimated for geometrically regularizing landmark localization, which is then NOT involved in the testing phase. A novel loss is designed to, besides considering the geometrical regularization, mitigate the issue of data imbalance by adjusting weights of samples to different states, such as large pose, extreme lighting, and occlusion, in the training set.…
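The abstract does not give the loss formula, so the following is only a minimal sketch of how a sample-weighted landmark loss of this kind could look. It assumes that each sample's squared landmark error is scaled by a geometric penalty derived from the rotation (pose angle) error and by an imbalance weight for its attribute state. The function name, argument names, array shapes, and the multiplicative combination are all hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def weighted_landmark_loss(pred, target, angle_err, state_weights):
    """Hypothetical sample-weighted landmark loss (illustrative only).

    pred, target  : (M, N, 2) predicted / ground-truth landmark coordinates
    angle_err     : (M, K) pose-angle errors in radians (e.g. yaw, pitch, roll)
    state_weights : (M,) per-sample weights for rare states such as
                    large pose, extreme lighting, or occlusion (assumed given)
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    angle_err = np.asarray(angle_err, dtype=float)
    state_weights = np.asarray(state_weights, dtype=float)

    # Squared L2 landmark error, summed over all landmarks of each sample.
    sq_dist = np.sum((pred - target) ** 2, axis=(1, 2))

    # Geometric penalty: grows with the rotation error, zero when it is exact.
    geo_penalty = np.sum(1.0 - np.cos(angle_err), axis=1)

    # Assumed combination: both factors scale the landmark error per sample.
    return float(np.mean(state_weights * geo_penalty * sq_dist))
```

In this sketch a sample with a larger pose error or a rarer state contributes more to the loss. Because the rotation error is only needed to compute the training loss, it plays no role at test time, which is consistent with the abstract.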
Taxonomy
Topics: Face recognition and analysis · Biometric Identification and Security · Speech and Audio Processing
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
