Knowledge Distillation in Deep Learning and its Applications
Abdolmaged Alkhulaifi, Fahad Alsahli, Irfan Ahmad

TL;DR
This paper surveys knowledge distillation techniques in deep learning, introduces a new metric for comparing methods, and discusses their effectiveness in deploying smaller models on resource-limited devices.
Contribution
It provides a comprehensive survey of knowledge distillation methods and proposes a novel metric for performance comparison.
Findings
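Knowledge distillation makes it practical to deploy smaller student models on resource-constrained devices such as mobile phones and embedded devices.
The proposed distillation metric compares different distillation algorithms based on model sizes and accuracy scores.
The survey highlights key challenges and future directions in the field.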
Abstract
Deep learning-based models are relatively large, and it is hard to deploy them on resource-limited devices such as mobile phones and embedded devices. One possible solution is knowledge distillation, whereby a smaller model (the student model) is trained using information from a larger model (the teacher model). In this paper, we present a survey of knowledge distillation techniques applied to deep learning models. To compare the performance of different techniques, we propose a new metric called the distillation metric, which compares knowledge distillation algorithms based on their sizes and accuracy scores. Based on the survey, some interesting conclusions are drawn and presented in this paper.
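The abstract describes the student as learning from "information" supplied by the teacher without fixing a particular mechanism. One common form, sketched below, is the temperature-softened soft-target loss. This is an illustrative assumption, not necessarily the formulation used by any specific method in the survey. The function names, the `temperature` and `alpha` parameters, and the example logits are all hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    Assumed formulation (not taken from the surveyed paper):
    alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy.
    """
    eps = 1e-12  # guards against log(0) for extreme logits
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    soft = sum(pt * (math.log(max(pt, eps)) - math.log(max(ps, eps)))
               for pt, ps in zip(p_teacher, p_student))
    # Ordinary cross-entropy of the student against the ground-truth label.
    hard = -math.log(max(softmax(student_logits)[label], eps))
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard

teacher = [2.0, 1.0, 0.1]
matched = distillation_loss(teacher, teacher, label=0)
mismatched = distillation_loss([0.1, 1.0, 2.0], teacher, label=0)
print(matched < mismatched)
```

In this sketch, a student whose logits match the teacher's incurs only the hard-label term, while a student that disagrees with the teacher is penalized by both terms. The `temperature ** 2` factor is a conventional scaling that keeps the soft-target term comparable in magnitude as the temperature changes.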
Taxonomy
Methods: Knowledge Distillation
