The Importance of the Instantaneous Phase in Detecting Faces with Convolutional Neural Networks
Luis Sanchez Tapia

TL;DR
This thesis investigates how instantaneous phase information improves the accuracy of CNN-based face detection, using a block-based approach on unconstrained classroom video data, and compares phase against amplitude and grayscale features.
Contribution
It introduces the use of instantaneous phase in frequency modulation images for face detection, reducing training overhead and improving interpretability in CNN-based methods.
Findings
Instantaneous phase improves face detection accuracy.
Frequency modulation images outperform amplitude and grayscale features.
Reduced training overhead with dominant component analysis.
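The findings above contrast instantaneous phase, amplitude, and frequency modulation (FM) features. As a rough illustration of what those terms mean, the sketch below decomposes a single image row into an amplitude envelope and a phase, and forms the FM component as the cosine of the phase, following the usual AM-FM model I = a·cos(φ). It is not the thesis's estimator: this summary does not describe how dominant component analysis is carried out, so a 1D FFT-based analytic signal is swapped in as a stand-in. The function names `analytic_signal` and `am_fm_components` and the toy input are assumptions made for this example.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real 1D array, built with the FFT (Hilbert transform)."""
    n = x.size
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist when n is even), double positive frequencies,
    # and zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def am_fm_components(x):
    """Return (amplitude, phase, fm) for a real 1D signal with its mean removed."""
    z = analytic_signal(x - x.mean())
    amplitude = np.abs(z)    # instantaneous amplitude (AM)
    phase = np.angle(z)      # instantaneous phase, wrapped to (-pi, pi]
    fm = np.cos(phase)       # FM component: unit-amplitude oscillation
    return amplitude, phase, fm

# Toy row (hypothetical data): constant brightness plus one sinusoidal texture.
t = np.arange(256)
row = 100.0 + 40.0 * np.cos(2 * np.pi * 8 * t / 256)
amplitude, phase, fm = am_fm_components(row)
```

For this toy row the amplitude is constant while the FM component retains the oscillation pattern, which is the intuition behind treating phase as the carrier of structural information.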
Abstract
Convolutional Neural Networks (CNNs) have provided new and accurate methods for processing digital images and videos. Yet training CNNs is extremely demanding in terms of computational resources. For specific applications, the standard use of transfer learning also tends to require far more resources than may be needed. Furthermore, the final systems tend to operate as black boxes that are difficult to interpret. The current thesis considers the problem of detecting faces in the AOLME video dataset. The AOLME dataset consists of a large video collection of group interactions recorded in unconstrained classroom environments. For the thesis, still image frames were extracted once per minute from 18 videos of 24 minutes each. Then, each video frame was divided into 9x5 blocks of 50x50 pixels each. For each of the 19,440 blocks, the percentage of face pixels was set as ground…
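The block labeling described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's code: it assumes a binary face mask per frame, a frame of 250 rows by 450 columns (9 blocks across, 5 down; the abstract does not state the orientation), and 24 frames per video. The name `block_face_percentages` and the toy mask are hypothetical.

```python
import numpy as np

BLOCK = 50           # block side in pixels (from the abstract)
ROWS, COLS = 5, 9    # assumed layout: 9 blocks across, 5 blocks down

def block_face_percentages(mask):
    """Percentage of face pixels in each 50x50 block of a binary face mask."""
    mask = np.asarray(mask, dtype=bool)
    if mask.shape != (ROWS * BLOCK, COLS * BLOCK):
        raise ValueError(f"expected a {ROWS * BLOCK}x{COLS * BLOCK} mask")
    # Split both axes into (block index, offset inside block), then average
    # over the two offset axes to get the face fraction per block.
    blocks = mask.reshape(ROWS, BLOCK, COLS, BLOCK)
    return 100.0 * blocks.mean(axis=(1, 3))

# Toy mask (hypothetical): a face region covering half of one block.
mask = np.zeros((ROWS * BLOCK, COLS * BLOCK), dtype=bool)
mask[50:100, 100:125] = True
pct = block_face_percentages(mask)

# Block count implied by the abstract: 18 videos x 24 frames x 45 blocks.
total_blocks = 18 * 24 * pct.size
```

With one frame per minute from 18 videos of 24 minutes, this layout gives 18 × 24 × 45 = 19,440 blocks, matching the figure in the abstract.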
Taxonomy
Topics: Face and Expression Recognition
Methods: Attention Model
