Efficient training of lightweight neural networks using Online Self-Acquired Knowledge Distillation
Maria Tzelepi, Anastasios Tefas

TL;DR
This paper introduces Online Self-Acquired Knowledge Distillation (OSAKD), a method that improves the training of lightweight neural networks by estimating posterior class probabilities directly in the output feature space and using them as soft labels, at negligible additional computational cost.
Contribution
The paper proposes a novel online knowledge distillation approach using non-parametric density estimation to improve model performance efficiently.
Findings
Effective on four datasets
Reduces computational cost compared to traditional KD
Improves accuracy of lightweight models
Abstract
Knowledge Distillation (KD) has been established as a highly promising approach for training compact and faster models by transferring knowledge from heavyweight and powerful models. However, KD in its conventional version constitutes an enduring, computationally and memory-demanding process. In this paper, Online Self-Acquired Knowledge Distillation (OSAKD) is proposed, aiming to improve the performance of any deep neural model in an online manner. We utilize the k-nn non-parametric density estimation technique for estimating the unknown probability distributions of the data samples in the output feature space. This allows us to directly estimate the posterior class probabilities of the data samples, and we use them as soft labels that encode explicit information about the similarities of the data with the classes, negligibly affecting the computational cost. The experimental evaluation on…
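As a rough illustration of the idea in the abstract, the classical k-nn estimate of the posterior class probability of a sample is k_c / k, the fraction of its k nearest neighbors that belong to class c. The sketch below computes that estimate within a batch of feature vectors. The function name `knn_soft_labels`, the use of squared Euclidean distance, the exclusion of each sample from its own neighborhood, and the per-batch neighbor search are all assumptions made for illustration; the excerpt does not specify the paper's exact formulation or how the soft labels enter the training loss.

```python
import numpy as np


def knn_soft_labels(features, labels, num_classes, k):
    """Estimate P(class | sample) as the class fraction among the k nearest neighbors.

    Hypothetical helper for illustration only, not the authors' implementation.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)

    # Pairwise squared Euclidean distances between all samples in the batch.
    diffs = features[:, None, :] - features[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)

    # Assumption: a sample is not counted as its own neighbor.
    np.fill_diagonal(dists, np.inf)

    # Indices of the k nearest neighbors of each sample, shape (n, k).
    neighbors = np.argsort(dists, axis=1)[:, :k]

    # One-hot encode the neighbors' labels, shape (n, k, num_classes),
    # then average over neighbors to obtain k_c / k for every class c.
    one_hot = np.eye(num_classes)[labels[neighbors]]
    return one_hot.mean(axis=1)
```

Each returned row sums to 1 and can serve as a soft label for its sample. In a training loop, `features` would presumably be the network's output-layer representations for the current batch, which is what would make such an approach online and teacher-free.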
Taxonomy
Methods: Knowledge Distillation · k-Nearest Neighbors
