TL;DR
This paper introduces MicroExpNet, a tiny, fast CNN for facial expression recognition (FER) trained with knowledge distillation. The model is under 1MB and runs at 1851 fps on CPU, and the authors find that max-pooling improves generalization, contrary to their initial hypothesis.
Contribution
Proposes MicroExpNet, an extremely small and fast CNN for FER, and explores the effects of knowledge distillation and max-pooling on model performance and generalization.
Findings
MicroExpNet is less than 1MB and runs at 1851 fps on CPU.
Knowledge distillation improves small-model performance, and its effect grows as model size decreases.
Max-pooling unexpectedly enhances generalization in FER models.
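For context on the max-pooling finding, here is a minimal NumPy sketch of the local translation invariance that a max-pooling layer provides. It is an illustration only, not the paper's architecture: the helper `max_pool_2x2`, the 4x4 input, and the 2x2 window size are all assumptions chosen for brevity.

```python
import numpy as np

def max_pool_2x2(image):
    """Non-overlapping 2x2 max-pooling over a 2-D array (hypothetical helper)."""
    h, w = image.shape
    # Crop to even dimensions, then split into 2x2 windows and take each window's max.
    cropped = image[:h - h % 2, :w - w % 2]
    return cropped.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# A single bright pixel, and the same pixel shifted by one position
# while staying inside the same 2x2 pooling window.
image = np.zeros((4, 4))
image[0, 0] = 1.0
shifted_inside = np.zeros((4, 4))
shifted_inside[1, 1] = 1.0

# The same pixel shifted across a window boundary.
shifted_across = np.zeros((4, 4))
shifted_across[2, 2] = 1.0

same = np.array_equal(max_pool_2x2(image), max_pool_2x2(shifted_inside))
differs = not np.array_equal(max_pool_2x2(image), max_pool_2x2(shifted_across))
print("invariant to shift inside a window:", same)
print("changes for shift across windows:", differs)
```

The pooled output is unchanged when the pixel moves within one pooling window and changes once it crosses into another, so the invariance is only local. This is the property the authors expected to hurt FER, where small pixel-level changes carry signal.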
Abstract
This paper is aimed at creating extremely small and fast convolutional neural networks (CNN) for the problem of facial expression recognition (FER) from frontal face images. To this end, we employed the popular knowledge distillation (KD) method and identified two major shortcomings with its use: 1) a fine-grained grid search is needed for tuning the temperature hyperparameter and 2) to find the optimal size-accuracy balance, one needs to search for the final network size (or the compression rate). On the other hand, KD proved useful for model compression for the FER problem, and we discovered that its effect becomes more and more significant as model size decreases. In addition, we hypothesized that translation invariance achieved using max-pooling layers would not be useful for the FER problem as the expressions are sensitive to small, pixel-wise changes around the eye…
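To make the role of the temperature hyperparameter concrete, here is a minimal NumPy sketch of a distillation loss in the commonly used form: a weighted sum of hard-label cross-entropy and a temperature-softened cross-entropy against the teacher. This is an assumption about the general technique, not the paper's exact objective. The names `kd_loss` and `alpha`, the `temperature ** 2` scaling, the example logits, and the temperature grid are all illustrative.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis, softened by `temperature`."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, temperature=4.0, alpha=0.5):
    """Hypothetical KD objective: alpha * hard CE + (1 - alpha) * T^2 * soft CE."""
    eps = 1e-12
    n = len(labels)
    # Hard term: cross-entropy of the student against ground-truth labels (T = 1).
    p_student = softmax(student_logits)
    hard = -np.log(p_student[np.arange(n), labels] + eps).mean()
    # Soft term: cross-entropy of the softened student against the softened teacher.
    p_teacher_soft = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    soft = -(p_teacher_soft * np.log(p_student_soft + eps)).sum(axis=-1).mean()
    # T^2 keeps the soft-term gradient magnitude comparable across temperatures.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft

# Illustrative logits for one sample and three classes (not from the paper).
student = np.array([[2.0, 0.5, -1.0]])
teacher = np.array([[3.0, 1.0, -2.0]])
labels = np.array([0])

# The grid search mentioned in the abstract amounts to a sweep like this,
# with a full training run per temperature; the values here are assumptions.
for t in (1.0, 2.0, 4.0, 8.0, 16.0):
    print(f"T={t:>4}: loss={kd_loss(student, teacher, labels, temperature=t):.4f}")
```

Higher temperatures flatten both distributions, so the student is trained on the teacher's relative class similarities rather than only its top prediction. In this sketch nothing indicates which temperature is best, which is consistent with the abstract's point that it has to be found by search.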
Taxonomy
Methods: Knowledge Distillation
