PFLD: A Practical Facial Landmark Detector

Xiaojie Guo; Siyuan Li; Jinke Yu; Jiawan Zhang; Jiayi Ma; Lin Ma; Wei Liu; and Haibin Ling

arXiv:1902.10859 · cs.CV · March 5, 2019 · 35 cites

TL;DR

This paper introduces PFLD, a compact and efficient facial landmark detector that achieves high accuracy and real-time speed on mobile devices, suitable for practical applications in unconstrained environments.

Contribution

The paper proposes a novel end-to-end single-stage network with a new loss function and training strategies, enabling accurate, fast, and lightweight facial landmark detection on mobile devices.

Findings

01. Outperforms state-of-the-art methods on the 300W and AFLW benchmarks.

02. Achieves over 140 fps on a mobile phone with high precision.

03. Model size is only 2.1 MB, suitable for real-time applications.

Abstract

Being accurate, efficient, and compact is essential to a facial landmark detector for practical use. To simultaneously consider the three concerns, this paper investigates a neat model with promising detection accuracy under wild environments (e.g., unconstrained pose, expression, lighting, and occlusion conditions) and super real-time speed on a mobile device. More concretely, we customize an end-to-end single-stage network associated with acceleration techniques. During the training phase, for each sample, rotation information is estimated for geometrically regularizing landmark localization, which is then NOT involved in the testing phase. A novel loss is designed to, besides considering the geometrical regularization, mitigate the issue of data imbalance by adjusting weights of samples to different states, such as large pose, extreme lighting, and occlusion, in the training set.…
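The abstract describes the loss only in words, so the following is a minimal sketch of how such a weighting could look, not the paper's exact formulation. It assumes a multiplicative form: each sample's squared landmark error is scaled by a geometric factor derived from its rotation deviation and by an imbalance factor derived from its attribute classes. The function name `weighted_landmark_loss`, its argument names, and the use of `1 - cos(angle)` as the geometric factor are all illustrative assumptions.

```python
import math


def weighted_landmark_loss(pred, target, angle_err, attr_weights):
    """Sketch of a geometry- and imbalance-weighted landmark loss (assumed form).

    pred, target : per-sample lists of (x, y) landmark tuples
    angle_err    : per-sample (yaw, pitch, roll) deviations in radians between
                   the estimated and reference rotation (training time only)
    attr_weights : per-sample list of weights, one per attribute class the
                   sample belongs to (e.g. large pose, occlusion); rarer
                   classes would get larger weights (hypothetical values)
    """
    total = 0.0
    for p, t, angles, weights in zip(pred, target, angle_err, attr_weights):
        # Geometric factor: grows as the rotation deviation grows.
        geo = sum(1.0 - math.cos(a) for a in angles)
        # Imbalance factor: sum of the weights of the sample's attribute classes.
        imb = sum(weights)
        # Squared Euclidean distance summed over all landmarks.
        dist = sum((px - tx) ** 2 + (py - ty) ** 2
                   for (px, py), (tx, ty) in zip(p, t))
        total += imb * geo * dist
    # Average over the samples in the batch.
    return total / len(pred)


# Toy batch: one sample, two landmarks, one landmark off by 1 pixel in y.
pred = [[(0.0, 0.0), (1.0, 1.0)]]
target = [[(0.0, 1.0), (1.0, 1.0)]]
small = weighted_landmark_loss(pred, target, [(0.1, 0.0, 0.0)], [[1.0]])
large = weighted_landmark_loss(pred, target, [(1.0, 0.0, 0.0)], [[1.0]])
print(small < large)  # larger rotation deviation, larger penalty
```

Under this assumed form a sample with zero rotation deviation contributes nothing to the loss, so a practical variant might add a constant offset to the geometric factor. The rotation input is needed only to compute the training loss, which is consistent with the abstract's statement that rotation estimation is not involved at test time.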

Taxonomy

Topics: Face recognition and analysis · Biometric Identification and Security · Speech and Audio Processing

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings