KPNet: Towards Minimal Face Detector

Guanglu Song; Yu Liu; Yuhang Zang; Xiaogang Wang; Biao Leng; Qingsheng Yuan

arXiv:2003.07543·cs.CV·March 18, 2020·1 cites

TL;DR

KPNet is a minimal, fast, and accurate face detector that infers face bounding boxes from facial keypoints. With only about 1 million parameters, it outperforms more complex models.

Contribution

The paper introduces KPNet, a bottom-up face detection method that detects facial keypoints to accurately infer face bounding boxes using a minimal neural network.
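To illustrate the last step of that bottom-up idea, here is a minimal sketch of how a box might be derived from detected keypoints. The function name `box_from_keypoints` and the `margin` parameter are hypothetical, introduced only for this example. The rule shown (the tight extent of the keypoints, optionally padded) is one simple assumption, not necessarily the mapping used in the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]              # (x, y) in pixels
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_from_keypoints(points: List[Point], margin: float = 0.0) -> Box:
    """Return a box enclosing the keypoints (hypothetical helper).

    `margin` is an assumed padding, given as a fraction of the keypoint
    extent along each axis. It stands in for whatever box definition a
    dataset or application prefers.
    """
    if not points:
        raise ValueError("at least one keypoint is required")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Pad each side proportionally to the keypoint extent on that axis.
    pad_x = margin * (x_max - x_min)
    pad_y = margin * (y_max - y_min)
    return (x_min - pad_x, y_min - pad_y, x_max + pad_x, y_max + pad_y)
```

Because the box is computed after the keypoints are known, changing the box definition (for example, a larger `margin`) would not require retraining the keypoint detector, which is the flexibility the summary alludes to.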

Findings

01. Achieves state-of-the-art accuracy on face detection benchmarks.

02. Runs at 1000 fps on GPU with only ~1M parameters.

03. Operates effectively in real-time on modern front-end chips.

Abstract

The small receptive field and limited capacity of minimal neural networks restrict their performance when they serve as detector backbones. In this work, we find that the appearance features of a generic face are discriminative enough for a tiny, shallow neural network to distinguish it from the background. The essential barriers are 1) the vague definition of the face bounding box and 2) the tricky design of anchor boxes or receptive fields. Unlike most top-down methods for joint face detection and alignment, the proposed KPNet detects small facial keypoints instead of the whole face, in a bottom-up manner. It first predicts the facial landmarks from a low-resolution image via the well-designed fine-grained scale approximation and scale adaptive soft-argmax operator. Finally, the precise face bounding boxes, no matter how we define them, can be inferred from the keypoints. Without any…
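The abstract mentions a soft-argmax operator for reading keypoint locations off a low-resolution prediction. As a rough illustration, the sketch below implements a plain (not scale adaptive) soft-argmax over a 2D heatmap. The function name `soft_argmax_2d` and the temperature `beta` are assumptions made for this example; the paper's scale adaptive variant is not reproduced here.

```python
import math
from typing import List, Tuple


def soft_argmax_2d(heatmap: List[List[float]], beta: float = 1.0) -> Tuple[float, float]:
    """Return the softmax-weighted mean (x, y) position of a heatmap.

    This is the standard soft-argmax, shown as an assumption-laden
    stand-in: `beta` is a hypothetical temperature, where larger values
    push the result toward the hard argmax.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(max(row) for row in heatmap)
    total = 0.0
    x_acc = 0.0
    y_acc = 0.0
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            weight = math.exp(beta * (value - peak))
            total += weight
            x_acc += weight * x
            y_acc += weight * y
    # Expected coordinates under the softmax distribution.
    return x_acc / total, y_acc / total
```

Unlike a hard argmax, the result is a weighted average over all cells, so it can fall between grid positions. That sub-pixel behaviour is one reason operators of this kind are useful when keypoints are predicted from a low-resolution map.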

Taxonomy

Topics: Face recognition and analysis · Face and Expression Recognition · Advanced Image and Video Retrieval Techniques