A Comparison of CNN-based Face and Head Detectors for Real-Time Video Surveillance Applications
Le Thanh Nguyen-Meidine, Eric Granger, Madhu Kiran, Louis-Antoine Blais-Morin

TL;DR
This paper compares various CNN architectures for face and head detection in real-time video surveillance, evaluating their accuracy and computational efficiency to determine their practicality in real-world applications.
Contribution
It provides an empirical comparison of state-of-the-art CNN architectures for face and head detection, focusing on accuracy and real-time computational feasibility.
Findings
CNN architectures achieve higher accuracy than traditional methods
Computational complexity limits real-time application viability
Region-based architectures offer a balance between accuracy and efficiency
Abstract
Detecting faces and heads appearing in video feeds are challenging tasks in real-world video surveillance applications due to variations in appearance, occlusions and complex backgrounds. Recently, several CNN architectures have been proposed to increase the accuracy of detectors, although their computational complexity can be an issue, especially for real-time applications, where faces and heads must be detected live using high-resolution cameras. This paper compares the accuracy and complexity of state-of-the-art CNN architectures that are suitable for face and head detection. Single pass and region-based architectures are reviewed and compared empirically to baseline techniques according to accuracy and to time and memory complexity on images from several challenging datasets. The viability of these architectures is analyzed with real-time video surveillance applications in mind.…
