Dual Discriminator Adversarial Distillation for Data-free Model Compression

Haoran Zhao; Xin Sun; Junyu Dong; Hui Yu; Huiyu Zhou

arXiv:2104.05382 · cs.CV · October 6, 2021 · 1 citation

Open Access

TL;DR

This paper introduces a data-free knowledge distillation method called Dual Discriminator Adversarial Distillation (DDAD) that creates synthetic data to train compact neural networks without access to original training data.

Contribution

The paper proposes a novel data-free distillation approach using dual discriminator adversarial training to generate synthetic data for effective model compression.

Findings

1. Outperforms existing data-free distillation methods on classification benchmarks.
2. Effective for semantic segmentation tasks on multiple datasets.
3. Produces compact models closely matching teacher performance without original data.

Abstract

Knowledge distillation has been widely used to produce portable and efficient neural networks that are well suited to computer vision tasks on edge devices. However, almost all top-performing knowledge distillation methods need access to the original training data, which is usually very large and often unavailable. To tackle this problem, we propose a novel data-free approach, named Dual Discriminator Adversarial Distillation (DDAD), to distill a neural network without any training data or meta-data. Specifically, we use a generator to create samples through dual discriminator adversarial distillation that mimic the original training data. The generator not only uses the pre-trained teacher's intrinsic statistics in existing batch normalization layers but also obtains the maximum discrepancy from the student model. Then the generated samples are used to…
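
The abstract names two signals that drive the generator: agreement with the teacher's stored batch normalization statistics, and a large teacher-student discrepancy. The sketch below illustrates how such signals could be combined in a toy, single-channel setting. It is not the paper's formulation: the squared-gap statistics loss, the mean absolute discrepancy, the `bn_weight` parameter, and all function names are assumptions made for illustration only.

```python
from statistics import fmean, pvariance


def bn_statistics_loss(features, running_mean, running_var):
    """Squared gap between one channel's batch statistics and the
    running statistics stored in a batch normalization layer.
    (Assumed form; the excerpt does not give the exact loss.)"""
    batch_mean = fmean(features)
    batch_var = pvariance(features)  # population variance of the batch
    return (batch_mean - running_mean) ** 2 + (batch_var - running_var) ** 2


def discrepancy(teacher_out, student_out):
    """Mean absolute difference between teacher and student outputs.
    (Assumed measure of teacher-student disagreement.)"""
    return fmean([abs(t - s) for t, s in zip(teacher_out, student_out)])


def generator_loss(features, running_mean, running_var,
                   teacher_out, student_out, bn_weight=1.0):
    """Generator objective in this sketch: match the stored statistics
    while maximizing the discrepancy (hence the minus sign).
    bn_weight is a hypothetical balancing factor."""
    stats = bn_statistics_loss(features, running_mean, running_var)
    return bn_weight * stats - discrepancy(teacher_out, student_out)


def student_loss(teacher_out, student_out):
    """Student objective in this sketch: minimize the same discrepancy,
    which makes the two updates adversarial."""
    return discrepancy(teacher_out, student_out)


# Toy values: activations whose batch statistics already match the
# stored ones, and a student that disagrees with the teacher on one output.
features = [0.0, 2.0]
teacher_out = [1.0, 0.0]
student_out = [0.0, 0.0]
g_loss = generator_loss(features, 1.0, 1.0, teacher_out, student_out)
s_loss = student_loss(teacher_out, student_out)
```

In this toy setup the generator is rewarded when the student disagrees with the teacher, while the student is penalized for the same disagreement. A real implementation would compute these quantities over tensors from actual network layers and update the generator and student in alternation.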

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications

Methods: Knowledge Distillation · Batch Normalization