AdaDistill: Adaptive Knowledge Distillation for Deep Face Recognition
Fadi Boutros, Vitomir Štruc, Naser Damer

TL;DR
AdaDistill is an adaptive knowledge distillation method for deep face recognition. It adjusts the complexity of the distilled knowledge during training, without hyper-parameter tuning, to improve the performance of compact student models.
Contribution
The paper proposes AdaDistill, an adaptive KD approach that adjusts the complexity of the distilled knowledge to the student's learning stage, improving the face recognition accuracy of the compact student.
Findings
Outperforms state-of-the-art methods on IJB-B, IJB-C, and ICCV2021-MFR benchmarks.
Effectively adjusts the amount of complex knowledge distilled at different training stages.
Improves the discriminative capability of compact face recognition models.
Abstract
Knowledge distillation (KD) aims at improving the performance of a compact student model by distilling the knowledge from a high-performing teacher model. In this paper, we present an adaptive KD approach, namely AdaDistill, for deep face recognition. The proposed AdaDistill embeds the KD concept into the softmax loss by training the student using a margin penalty softmax loss with distilled class centers from the teacher. Being aware of the relatively low capacity of the compact student model, we propose to distill less complex knowledge at an early stage of training and more complex one at a later stage of training. This relative adjustment of the distilled knowledge is controlled by the progression of the learning capability of the student over the training iterations without the need to tune any hyper-parameters. Extensive experiments and ablation studies show that AdaDistill can…
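The idea in the abstract, a margin penalty softmax loss whose class centers come from the teacher rather than being learned freely, can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation. The function names, the ArcFace-style additive angular margin, the `scale` and `margin` defaults, and in particular the rule that weights each center update by the student-teacher cosine similarity are all assumptions made for this example; the abstract only states that the adjustment follows the student's learning progress.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def update_center(center, teacher_emb, student_emb):
    """Move a class center toward the teacher embedding of one sample.

    ASSUMPTION (hypothetical rule, not taken from the paper): the step size
    is the student-teacher cosine similarity clipped at 0, so a student that
    is still far from the teacher receives only a small center update.
    """
    alpha = max(0.0, cosine(student_emb, teacher_emb))
    t = l2_normalize(teacher_emb)
    return [(1.0 - alpha) * c + alpha * x for c, x in zip(center, t)]


def margin_softmax_loss(student_emb, label, centers, scale=64.0, margin=0.5):
    """Cross-entropy over scaled cosine logits with an angular margin.

    `centers` holds one (distilled) center per class. The margin is added to
    the angle of the target class only, as in ArcFace-style losses. The usual
    safeguard for angles where theta + margin exceeds pi is omitted here.
    """
    logits = []
    for k, center in enumerate(centers):
        cos_t = max(-1.0, min(1.0, cosine(student_emb, center)))
        if k == label:
            cos_t = math.cos(math.acos(cos_t) + margin)
        logits.append(scale * cos_t)
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


if __name__ == "__main__":
    centers = [[1.0, 0.0], [0.0, 1.0]]
    # A student embedding aligned with its class center gives a lower loss
    # than one aligned with the wrong center.
    print(margin_softmax_loss([1.0, 0.0], 0, centers))
    print(margin_softmax_loss([0.0, 1.0], 0, centers))
```

In a real training loop the student embeddings would come from the compact network and the centers would be refreshed from teacher embeddings at every iteration; only the student would receive gradients from the loss.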
Taxonomy
Topics: Face recognition and analysis · Face and Expression Recognition
Methods: Softmax · Attentive Walk-Aggregating Graph Neural Network
