Compact Convolutional Neural Network Cascade for Face Detection
Ilya Kalinovskii, Vladimir Spitsyn

TL;DR
This paper introduces a compact convolutional neural network cascade for frontal face detection that runs in real time on high-resolution video at low computational cost, making it suitable for mobile platforms.
Contribution
The paper presents a novel, highly efficient CNN cascade for face detection that outperforms existing algorithms in speed while maintaining competitive accuracy.
Findings
Real-time face detection in 4K Ultra HD video at 27 fps.
High computational efficiency enables deployment on mobile devices.
Detection performance is robust to background complexity and object count.
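To put the 27 fps figure in perspective, the short calculation below works out the per-frame time budget and pixel throughput it implies. It assumes that 4K Ultra HD means 3840 x 2160 pixels per frame, which this summary does not state. The variable names are illustrative only.

```python
# Back-of-the-envelope budget implied by "4K Ultra HD at 27 fps".
# Assumption: a 4K Ultra HD frame is 3840 x 2160 pixels.
WIDTH, HEIGHT = 3840, 2160
FPS = 27

pixels_per_frame = WIDTH * HEIGHT            # pixels the detector must cover per frame
frame_budget_ms = 1000.0 / FPS               # time available for each frame, in ms
pixels_per_second = pixels_per_frame * FPS   # sustained pixel throughput

print(f"{pixels_per_frame} px/frame, "
      f"{frame_budget_ms:.2f} ms/frame, "
      f"{pixels_per_second / 1e6:.1f} Mpx/s")
```

Under that assumption the detector has roughly 37 ms to process about 8.3 million pixels per frame.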
Abstract
Face detection in images and video streams is a classical problem in computer vision. Many solutions have been proposed, but the question of their optimality remains open. Many algorithms achieve high-quality face detection, but at the cost of high computational complexity, which restricts their use in real-time systems. This paper presents a new solution to the frontal face detection problem based on a cascade of compact convolutional neural networks. Test results on the FDDB dataset show that it is competitive with state-of-the-art algorithms. The proposed detector is implemented using three technologies: SSE/AVX/AVX2 instruction sets for Intel CPUs, Nvidia CUDA, and OpenCL. The detection speed of our approach considerably exceeds that of all existing CPU-based and GPU-based algorithms. Because of its high computational efficiency, our detector can…
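The general idea behind a detector cascade can be sketched as follows: each stage scores the surviving candidate windows and discards those below a threshold, so that later, more expensive stages see fewer windows. This is a minimal sketch of that generic early-rejection pattern, not the authors' network. The `Window` layout, the `run_cascade` helper, and the two toy stage functions are hypothetical stand-ins for trained CNN stages.

```python
from typing import Callable, List, Sequence, Tuple

Window = Tuple[int, int, int]                     # hypothetical (x, y, size) candidate
Stage = Tuple[Callable[[Window], float], float]   # (scoring function, threshold)


def run_cascade(windows: Sequence[Window], stages: Sequence[Stage]) -> List[Window]:
    """Return the windows that pass every stage, rejecting as early as possible."""
    survivors = list(windows)
    for score, threshold in stages:
        # Only windows accepted by all previous stages reach this one.
        survivors = [w for w in survivors if score(w) >= threshold]
        if not survivors:
            break  # nothing left to evaluate
    return survivors


# Toy placeholders for CNN stages (illustrative only, not trained models).
def cheap_stage(w: Window) -> float:
    return 1.0 if w[2] >= 24 else 0.0             # rejects very small windows


def costly_stage(w: Window) -> float:
    return 1.0 if (w[0] + w[1]) % 2 == 0 else 0.0  # arbitrary stand-in rule


candidates = [(0, 0, 12), (10, 10, 24), (11, 10, 48), (20, 30, 64)]
faces = run_cascade(candidates, [(cheap_stage, 0.5), (costly_stage, 0.5)])
print(faces)
```

Ordering the stages from cheapest to most expensive is what makes the pattern efficient: most background windows are rejected before the costly stage runs.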
Taxonomy
Topics: Video Surveillance and Tracking Methods · Face and Expression Recognition · Advanced Image and Video Retrieval Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
