Hybrid Data-Free Knowledge Distillation
Jialiang Tang, Shuo Chen, Chen Gong

TL;DR
This paper introduces HiDFD, a hybrid data-free knowledge distillation method that trains compact student networks from a small amount of collected real data combined with generated synthetic examples, outperforming existing approaches.
Contribution
The novel HiDFD framework combines teacher-guided GAN generation with data inflation and feature alignment, enabling effective knowledge distillation with significantly less real data.
Findings
Achieves state-of-the-art performance while using 120x less collected real data than existing methods.
Uses a feature integration mechanism to prevent GAN overfitting.
Employs a category frequency smoothing technique for balanced generation.
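The summary above does not say how category frequency smoothing is computed. As a rough illustration of the general idea only, the sketch below flattens a skewed class-frequency distribution before sampling labels for a conditional generator. The function name `smoothed_class_probs`, the add-one count, and the exponent `alpha` are all assumptions made for this example and are not taken from the paper.

```python
import random
from collections import Counter


def smoothed_class_probs(labels, num_classes, alpha=0.5):
    """Return sampling probabilities that are flatter than the raw class frequencies.

    Hypothetical helper: raises (count + 1) to the power alpha, then normalizes.
    alpha = 0 gives a uniform distribution; alpha = 1 gives add-one smoothed
    frequencies. This is an assumed scheme, not the paper's formula.
    """
    counts = Counter(labels)
    weights = [(counts.get(c, 0) + 1) ** alpha for c in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]


# A heavily imbalanced collected set: class 0 dominates.
collected_labels = [0] * 90 + [1] * 9 + [2] * 1
probs = smoothed_class_probs(collected_labels, num_classes=3, alpha=0.5)

# Draw labels to condition the generator on, using the smoothed distribution.
rng = random.Random(0)
sampled_labels = rng.choices(range(3), weights=probs, k=1000)
```

With `alpha` below 1, rare classes are requested from the generator more often than their raw frequency would suggest, which is the balancing effect the finding describes.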
Abstract
Data-free knowledge distillation aims to learn a compact student network from a pre-trained large teacher network without using the original training data of the teacher network. Existing collection-based and generation-based methods train student networks by collecting massive real examples and generating synthetic examples, respectively. However, they inevitably become weak in practical scenarios due to the difficulties in gathering or emulating sufficient real-world data. To solve this problem, we propose a novel method called Hybrid Data-Free Distillation (HiDFD), which leverages only a small amount of collected data as well as generates sufficient examples for training student networks. Our HiDFD comprises two primary modules, i.e., the teacher-guided generation and student distillation. The teacher-guided generation module…
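For readers unfamiliar with the distillation step the abstract refers to, here is a minimal sketch of the classic temperature-softened distillation loss, in which the student is trained to match the teacher's output distribution. It illustrates generic knowledge distillation, not HiDFD's specific objective; the function names and the default temperature of 4.0 are assumptions chosen for the example.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared."""
    log_p = log_softmax(teacher_logits, temperature)  # teacher (target)
    log_q = log_softmax(student_logits, temperature)  # student (prediction)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl


teacher = [5.0, 1.0, -2.0]
loss_matching = kd_loss([5.0, 1.0, -2.0], teacher)   # student agrees with teacher
loss_mismatch = kd_loss([-2.0, 1.0, 5.0], teacher)   # student disagrees
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge. In a data-free setting, the inputs that produce these logits would be collected or generated examples rather than the teacher's original training data.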
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications
Methods: Knowledge Distillation
