EResFD: Rediscovery of the Effectiveness of Standard Convolution for Lightweight Face Detection
Joonhyun Jeong, Beomyoung Kim, Joonsang Yu, Youngjoon Yoo

TL;DR
This paper demonstrates that heavily channel-pruned standard convolutional blocks in ResNet can outperform depthwise separable convolutions for lightweight face detection, giving better accuracy and faster inference at a similar parameter size.
Contribution
The study re-evaluates the effectiveness of standard convolution in lightweight face detection and proposes a ResNet-based backbone with reduced channels for improved efficiency.
Findings
Achieves 80.4% mAP on WIDER FACE Hard subset
Runs inference at 37.7 ms for VGA images on CPU
Outperforms MobileNet variants in accuracy and speed
Abstract
This paper analyzes the design choices of face detection architectures that improve the trade-off between computation cost and accuracy. Specifically, we re-examine the effectiveness of the standard convolutional block as a lightweight backbone architecture for face detection. Unlike the current tendency in lightweight architecture design, which relies heavily on depthwise separable convolution layers, we show that heavily channel-pruned standard convolution layers can achieve better accuracy and inference speed at a similar parameter size. This observation is supported by analyses of the characteristics of the target data domain, faces. Based on our observation, we propose to employ ResNet with highly reduced channel widths, which surprisingly allows high efficiency compared to other mobile-friendly networks (e.g., MobileNetV1, V2, V3). From the extensive experiments, we show that…
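To make the "similar parameter size" comparison concrete, the sketch below counts weights for a standard convolution and for a depthwise separable convolution, then finds the channel width at which a standard 3x3 layer fits the same budget. This is only an illustration: the function names and the 64-channel example are assumptions made here, not values or code from the paper, and biases and normalization parameters are ignored.

```python
import math


def standard_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a k x k depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    return depthwise + pointwise


def matched_standard_width(c: int, k: int = 3) -> int:
    """Largest width w such that a standard w -> w conv uses no more
    weights than a depthwise separable c -> c conv."""
    budget = depthwise_separable_params(c, c, k)
    return math.isqrt(budget // (k * k))


if __name__ == "__main__":
    c = 64  # hypothetical layer width, chosen only for illustration
    w = matched_standard_width(c)
    print("depthwise separable, 64 -> 64:", depthwise_separable_params(c, c))
    print("standard conv, 64 -> 64:", standard_conv_params(c, c))
    print("matched standard width:", w, "with", standard_conv_params(w, w), "weights")
```

Under these assumptions, a standard 3x3 layer has to shrink from 64 to about 22 channels to match the depthwise separable budget, which is the kind of heavy channel reduction the abstract refers to. Parameter count alone does not determine accuracy or latency; those are the paper's empirical claims.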
Taxonomy
Topics: Face recognition and analysis · Face and Expression Recognition · Biometric Identification and Security
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Average Pooling · Residual Connection · 1x1 Convolution · Global Average Pooling · Depthwise Convolution · Max Pooling · Batch Normalization · Residual Block
