Self-supervised Knowledge Distillation Using Singular Value Decomposition

Seung Hyun Lee; Dae Ha Kim; Byung Cheol Song

arXiv:1807.06819·cs.LG·July 19, 2018

TL;DR

This paper introduces a novel self-supervised knowledge distillation method using singular value decomposition (SVD) to improve the transfer of knowledge from a teacher to a student neural network, achieving better accuracy with less computation.

Contribution

It proposes a new SVD-based knowledge distillation technique and frames knowledge transfer as a self-supervised task, enhancing the effectiveness of student networks.

Findings

01

An S-DNN with 1/5 of the T-DNN's computational cost outperforms the T-DNN by up to 1.1% in classification accuracy.

02

The method outperforms state-of-the-art distillation approaches by 1.79% at the same computational cost.

03

Treating knowledge transfer as a self-supervised task lets the S-DNN receive information from the T-DNN continuously, which improves student network performance.

Abstract

Deep neural networks (DNNs) require huge training datasets and heavy computation. To address this, the so-called teacher-student (T-S) DNN, which transfers the knowledge of a teacher DNN (T-DNN) to a student DNN (S-DNN), has been proposed. However, existing T-S DNNs have a limited range of use, and the knowledge of the T-DNN is insufficiently transferred to the S-DNN. To improve the quality of the knowledge transferred from the T-DNN, we propose a new knowledge distillation method using singular value decomposition (SVD). In addition, we define knowledge transfer as a self-supervised task and suggest a way to continuously receive information from the T-DNN. Simulation results show that an S-DNN with 1/5 of the computational cost of the T-DNN can be up to 1.1% better than the T-DNN in terms of classification accuracy. Also, assuming the same computational cost, our S-DNN outperforms the S-DNN driven by the state-of-the-art distillation with a performance…
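The abstract does not spell out how SVD enters the distillation loss, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the teacher and student feature maps share the same channel count, compresses each map to its top-k right singular vectors, aligns their signs (singular vectors are only defined up to sign), and penalizes the squared difference. The function names, the value of k, and the loss form are all hypothetical choices made for illustration.

```python
import numpy as np


def top_k_right_singular_vectors(feature_map, k):
    """Compress an (H, W, C) feature map to its k leading right singular vectors.

    Returns a (C, k) matrix of singular vectors and the k leading singular values.
    Assumes k <= min(H * W, C).
    """
    h, w, c = feature_map.shape
    m = feature_map.reshape(h * w, c)  # rows: spatial positions, columns: channels
    # Economy SVD: m = u @ diag(s) @ vt, singular values sorted in descending order.
    _, s, vt = np.linalg.svd(m, full_matrices=False)
    return vt[:k].T, s[:k]


def align_signs(v_student, v_teacher):
    """Flip student columns so each one points the same way as its teacher column."""
    signs = np.sign(np.sum(v_student * v_teacher, axis=0))
    signs[signs == 0] = 1.0  # leave orthogonal pairs unchanged
    return v_student * signs


def svd_distillation_loss(f_teacher, f_student, k=4):
    """Mean squared difference between sign-aligned leading singular vectors.

    Illustrative loss only (an assumption, not the loss defined in the paper).
    """
    v_t, _ = top_k_right_singular_vectors(f_teacher, k)
    v_s, _ = top_k_right_singular_vectors(f_student, k)
    v_s = align_signs(v_s, v_t)
    return float(np.mean((v_t - v_s) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.standard_normal((8, 8, 16))                  # hypothetical T-DNN feature map
    student = teacher + 0.1 * rng.standard_normal((8, 8, 16))  # hypothetical S-DNN feature map
    print("loss:", svd_distillation_loss(teacher, student, k=4))
```

Keeping only the leading singular vectors shrinks each feature map to a small (C, k) summary, which is one plausible reading of how SVD could improve the quality of the transferred knowledge; in an actual training loop this term would be computed with a differentiable framework rather than NumPy.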

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification

Methods: Knowledge Distillation