Zero-Shot Knowledge Distillation in Deep Networks

Gaurav Kumar Nayak; Konda Reddy Mopuri; Vaisakh Shaj; R. Venkatesh Babu; Anirban Chakraborty

arXiv:1905.08114 · cs.LG · May 21, 2019 · 85 cites

TL;DR

This paper introduces Zero-Shot Knowledge Distillation, a novel data-free approach that synthesizes surrogate data from a complex teacher model to train smaller student models, addressing privacy and data access issues.

Contribution

The paper proposes a new data-free method for knowledge distillation that synthesizes Data Impressions from the teacher model without using any real training data or meta-data.

Findings

01 Achieves competitive performance compared to traditional data-dependent distillation.

02 Effective on multiple benchmark datasets.

03 Addresses privacy concerns by eliminating the need for original training data.

Abstract

Knowledge distillation deals with the problem of training a smaller model (Student) from a high capacity source model (Teacher) so as to retain most of its performance. Existing approaches use either the training data or meta-data extracted from it in order to train the Student. However, accessing the dataset on which the Teacher has been trained may not always be feasible if the dataset is very large or it poses privacy or safety concerns (e.g., bio-metric or medical data). Hence, in this paper, we propose a novel data-free method to train the Student from the Teacher. Without even using any meta-data, we synthesize the Data Impressions from the complex Teacher model and utilize these as surrogates for the original training data samples to transfer its learning to Student via knowledge distillation. We, therefore, dub our method "Zero-Shot Knowledge Distillation" and demonstrate that…
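
As a rough illustration of the distillation step the abstract describes, the sketch below computes a temperature-softened cross-entropy between Teacher and Student outputs in plain Python. This is the standard distillation loss, not necessarily the exact objective used in the paper. The function names, the temperature value, and the example logits are all assumptions chosen for illustration, and the synthesis of Data Impressions itself is not shown.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=20.0):
    """Cross-entropy between softened Teacher and Student distributions.

    The temperature default is an illustrative assumption, not a value
    taken from the paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


# Hypothetical logits for one surrogate sample (a stand-in for a Data Impression).
teacher_logits = [8.0, 2.0, -1.0]
matched = distillation_loss(teacher_logits, [8.0, 2.0, -1.0])
mismatched = distillation_loss(teacher_logits, [-1.0, 2.0, 8.0])

# A Student that reproduces the Teacher's outputs incurs the lower loss.
print(matched < mismatched)
```

In a training loop this loss would be minimized with respect to the Student's parameters over the surrogate samples. The Teacher stays fixed throughout.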

Code & Models

Repositories

vcl-iisc/ZSKD (TensorFlow, official)

Taxonomy

Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications