Enhancing Data-Free Adversarial Distillation with Activation Regularization and Virtual Interpolation
Xiaoyang Qu, Jianzong Wang, Jing Xiao

TL;DR
This paper adds an activation regularizer and a virtual interpolation method to data-free adversarial knowledge distillation, improving data generation efficiency and student model accuracy without access to the original training data.
Contribution
It proposes an activation regularizer and a virtual interpolation method that improve data generation efficiency and student model performance in data-free adversarial distillation.
Findings
Achieves 95.42% accuracy on CIFAR-10
Achieves 77.05% accuracy on CIFAR-100
Outperforms state-of-the-art data-free distillation methods
Abstract
Knowledge distillation is a technique for transferring the knowledge of a large learned model, or an ensemble of learned models, to a small model. This method relies on access to the original training set, which might not always be available. A possible solution is a data-free adversarial distillation framework, which deploys a generative network to transfer the teacher model's knowledge to the student model. However, data generation efficiency is low in data-free adversarial distillation. We add an activation regularizer and a virtual interpolation method to improve the data generation efficiency. The activation regularizer enables the student to match the teacher's predictions close to activation boundaries and decision boundaries. The virtual interpolation method can generate virtual samples and labels in-between decision boundaries. Our experiments show that our…
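The abstract does not spell out how virtual samples and labels are built. One plausible reading is a mixup-style convex combination of generated samples and of the teacher's predictions for them. The sketch below illustrates that reading only; the linear stand-in teacher `W_teacher`, the helper `virtual_interpolation`, the Beta(1, 1) mixing coefficient, and the random stand-in for generator outputs are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def virtual_interpolation(x, teacher_probs, lam, perm):
    """Mix each sample (and its teacher label) with a partner chosen by perm.

    Hypothetical helper: a mixup-style interpolation, assumed here as one
    possible form of the 'virtual interpolation' named in the abstract.
    """
    x_virtual = lam * x + (1.0 - lam) * x[perm]
    y_virtual = lam * teacher_probs + (1.0 - lam) * teacher_probs[perm]
    return x_virtual, y_virtual

rng = np.random.default_rng(0)

# Stand-in for a batch of generator outputs (assumption: 4 samples, 8 features).
x = rng.normal(size=(4, 8))

# Stand-in "teacher": a random linear classifier over 3 classes (assumption).
W_teacher = rng.normal(size=(8, 3))
teacher_probs = softmax(x @ W_teacher)

# Mixing coefficient and partner assignment.
lam = rng.beta(1.0, 1.0)
perm = rng.permutation(len(x))

x_virtual, y_virtual = virtual_interpolation(x, teacher_probs, lam, perm)
```

Under this reading, a student would be trained to match `y_virtual` on `x_virtual`, so it receives supervision at points lying between the generated samples rather than only at the samples themselves.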
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks
