A New Training Framework for Deep Neural Network

Zhenyan Hou; Wenxuan Fan

arXiv:2103.07350 · cs.LG · March 26, 2021 · 1 citation

Open Access

TL;DR

This paper introduces Self Distillation (SD), a general training framework that removes the need for a pre-trained teacher model, cutting the computational and storage overheads of conventional knowledge distillation. The authors report performance improvements across diverse tasks and benchmark datasets.

Contribution

The paper proposes a novel Self Distillation framework that eliminates the need for pre-trained teacher models in knowledge distillation, simplifying training and deployment.

Findings

01. Improved performance across multiple tasks and datasets

02. Reduces computational and storage overheads

03. Effective without pre-trained teacher models

Abstract

Knowledge distillation is the process of transferring knowledge from a large model to a small model. In this process, the small model learns the generalization ability of the large model and retains performance close to that of the large model. Knowledge distillation thus provides a way to migrate knowledge between models, which facilitates model deployment and speeds up inference. However, previous distillation methods require pre-trained teacher models, which still bring computational and storage overheads. In this paper, a novel general training framework called Self Distillation (SD) is proposed. We demonstrate the effectiveness of our method by reporting its performance improvements across diverse tasks and benchmark datasets.
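
For context, the conventional teacher-student setup that the abstract contrasts against can be sketched as follows. This is a minimal pure-Python illustration of the widely used soft-target distillation loss, not the Self Distillation method itself, whose details the abstract does not give. The function names and the `temperature` and `alpha` parameters are illustrative assumptions, not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and a soft-target KL term.

    Assumption: this is the classic teacher-student formulation, which
    needs logits from a pre-trained teacher. It is shown only to make the
    overhead discussed in the abstract concrete.
    """
    # Hard-label term: cross-entropy of the student against the true class.
    student_probs = softmax(student_logits)
    hard_term = -math.log(student_probs[label])

    # Soft-target term: the student matches the teacher's softened outputs.
    # Multiplying by temperature**2 keeps gradient magnitudes comparable
    # across temperatures.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft_term = kl_divergence(teacher_soft, student_soft) * temperature ** 2

    return (1 - alpha) * hard_term + alpha * soft_term


if __name__ == "__main__":
    # Toy three-class example with made-up logits.
    print(distillation_loss([1.0, 0.5, 0.1], [3.0, 1.0, 0.2], label=0))
```

The `teacher_logits` argument is where the cost arises: a separate teacher network has to be trained, stored, and run to produce it. A framework that works without a pre-trained teacher, as the abstract describes, would need to obtain its training signal from some other source.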

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning

Methods: Knowledge Distillation · Label Smoothing