Knowledge Distillation in Deep Learning and its Applications

Abdolmaged Alkhulaifi; Fahad Alsahli; Irfan Ahmad

arXiv:2007.09029·cs.LG·May 21, 2021

TL;DR

This paper surveys knowledge distillation techniques in deep learning, introduces a new metric for comparing methods, and discusses their effectiveness in deploying smaller models on resource-limited devices.

Contribution

It provides a comprehensive survey of knowledge distillation methods and proposes a novel metric for performance comparison.

Findings

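01

Knowledge distillation trains a smaller student model with information from a larger teacher model, which makes deployment on resource-limited devices such as mobile phones and embedded devices more practical.

02

The proposed distillation metric compares knowledge distillation algorithms based on model sizes and accuracy scores.

03

The survey draws conclusions about the surveyed techniques and highlights key challenges and future directions in the field.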

Abstract

Deep learning-based models are relatively large, and it is hard to deploy such models on resource-limited devices such as mobile phones and embedded devices. One possible solution is knowledge distillation, whereby a smaller model (the student model) is trained by utilizing information from a larger model (the teacher model). In this paper, we present a survey of knowledge distillation techniques applied to deep learning models. To compare the performance of different techniques, we propose a new metric called the distillation metric. The distillation metric compares different knowledge distillation algorithms based on model sizes and accuracy scores. Based on the survey, some interesting conclusions are drawn and presented in this paper.
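
To make the student/teacher idea concrete, the following is a minimal sketch of one widely used formulation, the temperature-softened soft-target loss. It is an illustration only: the abstract does not commit to a specific loss, and the function names (`softmax`, `distillation_loss`) and the parameters `temperature` and `alpha` are assumptions introduced here, not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical helper: weighted sum of a soft-target and a hard-label term.

    temperature and alpha are illustrative hyperparameters (assumptions).
    """
    # Soft term: KL(teacher || student) on temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    soft_term = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student_soft)
        if pt > 0.0
    )

    # Hard term: ordinary cross-entropy against the ground-truth label.
    p_student = softmax(student_logits, 1.0)
    hard_term = -math.log(p_student[true_label])

    # The temperature**2 factor is the usual rescaling that keeps the soft
    # term's gradient magnitude comparable across temperatures.
    return alpha * temperature ** 2 * soft_term + (1.0 - alpha) * hard_term


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]
    student = [1.5, 0.8, 0.3]
    print(distillation_loss(student, teacher, true_label=0))
```

When the student reproduces the teacher's logits exactly, the soft term vanishes and only the hard-label cross-entropy remains. Setting `alpha` closer to 1 makes the student rely more on the teacher's softened outputs than on the ground-truth labels.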

Taxonomy

Methods: Knowledge Distillation