Self Regulated Learning Mechanism for Data Efficient Knowledge Distillation
Sourav Mishra, Suresh Sundaram

TL;DR
This paper introduces a significance-based self-regulated learning approach for data-efficient knowledge distillation, enabling models to achieve competitive performance using fewer training samples by focusing on significant data points.
Contribution
The work proposes a novel self-regulation mechanism that restricts sample participation based on significance, improving data efficiency in knowledge distillation.
Findings
Achieves similar performance to state-of-the-art methods with fewer samples
Uses significance measures to weigh sample contributions during distillation
Demonstrates effectiveness on benchmark datasets
Abstract
Existing methods for distillation do not efficiently utilize the training data. This work presents a novel approach to perform distillation using only a subset of the training data, making it more data-efficient. For this purpose, the training of the teacher model is modified to include self-regulation wherein a sample in the training set is used for updating model parameters in the backward pass either if it is misclassified or the model is not confident enough in its prediction. This modification restricts the participation of samples, unlike the conventional training method. The number of times a sample participates in the self-regulated training process is a measure of its significance towards the model's knowledge. The significance values are used to weigh the losses incurred on the corresponding samples in the distillation process. This method is named significance-based…
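The self-regulation rule described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names, the confidence threshold, and the choice to normalize the weighted loss by the total significance are all assumptions, since the abstract does not specify them.

```python
from typing import List, Sequence


def participates(probs: Sequence[float], label: int, threshold: float) -> bool:
    """Return True if a sample should drive a parameter update.

    A sample participates when it is misclassified or when the model's
    confidence in its prediction falls below `threshold` (hypothetical
    parameter; the abstract does not give a value).
    """
    predicted = max(range(len(probs)), key=lambda k: probs[k])
    misclassified = predicted != label
    not_confident = probs[predicted] < threshold
    return misclassified or not_confident


def update_significance(
    counts: List[int],
    batch_probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float,
) -> List[int]:
    """Increment the participation count of every sample that participates."""
    for i, (probs, label) in enumerate(zip(batch_probs, labels)):
        if participates(probs, label, threshold):
            counts[i] += 1
    return counts


def significance_weighted_loss(
    losses: Sequence[float], counts: Sequence[int]
) -> float:
    """Weigh per-sample distillation losses by their significance counts.

    Normalizing by the total count is an assumption made for this sketch.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(c * l for c, l in zip(counts, losses)) / total
```

In this sketch, `update_significance` would be called once per epoch of teacher training, and the resulting counts would then be passed to `significance_weighted_loss` when training the student. Samples the teacher always classifies confidently keep a count of zero and contribute nothing to the weighted loss.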
Taxonomy
Methods: Knowledge Distillation
