Self Regulated Learning Mechanism for Data Efficient Knowledge Distillation

Sourav Mishra; Suresh Sundaram

arXiv:2102.07125·cs.LG·April 26, 2021

TL;DR

This paper introduces a significance-based self-regulated learning approach for data-efficient knowledge distillation, enabling models to achieve competitive performance using fewer training samples by focusing on significant data points.

Contribution

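The work proposes a self-regulation mechanism that lets a sample update the teacher model only when it is misclassified or predicted with low confidence. Each sample's participation count then serves as a significance weight during distillation, improving data efficiency.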

Findings

01

Achieves similar performance to state-of-the-art methods with fewer samples

02

Uses significance measures to weigh sample contributions during distillation

03

Demonstrates effectiveness on benchmark datasets

Abstract

Existing methods for distillation do not efficiently utilize the training data. This work presents a novel approach to perform distillation using only a subset of the training data, making it more data-efficient. For this purpose, the training of the teacher model is modified to include self-regulation wherein a sample in the training set is used for updating model parameters in the backward pass either if it is misclassified or the model is not confident enough in its prediction. This modification restricts the participation of samples, unlike the conventional training method. The number of times a sample participates in the self-regulated training process is a measure of its significance towards the model's knowledge. The significance values are used to weigh the losses incurred on the corresponding samples in the distillation process. This method is named significance-based…
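The mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the confidence cutoff `threshold` (and its default of 0.9), and the choice to normalise the weighted loss by the total significance are all assumptions made for illustration. The sketch only shows the selection rule (misclassified or not confident enough), the participation count used as significance, and a significance-weighted loss.

```python
from typing import List, Sequence


def participates(probs: Sequence[float], label: int, threshold: float) -> bool:
    """Return True if the sample should drive a parameter update.

    A sample participates if it is misclassified, or if the probability of
    the predicted class is below `threshold` (a hypothetical cutoff).
    """
    predicted = max(range(len(probs)), key=lambda k: probs[k])
    return predicted != label or probs[predicted] < threshold


def update_significance(
    counts: List[int],
    all_probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float = 0.9,  # assumed value, not taken from the paper
) -> List[int]:
    """Add one to the count of every sample that participates this epoch."""
    return [
        c + int(participates(p, y, threshold))
        for c, p, y in zip(counts, all_probs, labels)
    ]


def weighted_distillation_loss(
    losses: Sequence[float], significance: Sequence[int]
) -> float:
    """Significance-weighted mean of per-sample distillation losses.

    Dividing by the total significance is an assumption of this sketch.
    Samples with zero significance contribute nothing to the loss.
    """
    total = sum(significance)
    if total == 0:
        return 0.0
    return sum(s * l for s, l in zip(significance, losses)) / total


# One epoch over three samples that all have true label 0.
probs = [[0.95, 0.05], [0.60, 0.40], [0.20, 0.80]]
labels = [0, 0, 0]
counts = update_significance([0, 0, 0], probs, labels)
loss = weighted_distillation_loss([1.0, 2.0, 4.0], counts)
```

In this toy run the first sample is classified correctly with high confidence, so it does not participate and keeps a significance of zero. The second is correct but under-confident and the third is misclassified, so both participate and dominate the weighted loss.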

Taxonomy

Methods: Knowledge Distillation