Parallel Blockwise Knowledge Distillation for Deep Neural Network Compression
Cody Blakeney, Xiaomin Li, Yan Yan, Ziliang Zong

TL;DR
This paper introduces a parallel blockwise knowledge distillation method that significantly accelerates DNN compression, reducing training time and energy consumption while maintaining model accuracy.
Contribution
It proposes a novel parallel distillation algorithm using local information and depthwise separable layers to speed up complex DNN compression.
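One reason depthwise separable layers suit compression is that they need far fewer weights than a standard convolution. The sketch below counts weights for both layer types using the standard formulas. The layer shape (3x3 kernel, 256 input and 256 output channels) is a hypothetical example, not a configuration taken from the paper, and biases are omitted.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, chosen for illustration only.
k, c_in, c_out = 3, 256, 256
standard = conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)
ratio = standard / separable
print(standard, separable, round(ratio, 2))
```

For a fixed kernel size the reduction factor approaches k * k as the number of output channels grows, so wide 3x3 layers shrink by close to 9x.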
Findings
Achieves 3x speedup and 19-29% energy savings on VGG and ResNet distillation.
Maintains negligible accuracy loss during acceleration.
Further improves speedup to 3.87x using distributed GPU clusters.
Abstract
Deep neural networks (DNNs) have been extremely successful in solving many challenging AI tasks in natural language processing, speech recognition, and computer vision. However, DNNs are typically computation-intensive, memory-demanding, and power-hungry, which significantly limits their use on resource-constrained platforms. Therefore, a variety of compression techniques (e.g., quantization, pruning, and knowledge distillation) have been proposed to reduce the size and power consumption of DNNs. Blockwise knowledge distillation is one such technique that can effectively reduce the size of a highly complex DNN. However, it is not widely adopted because of its long training time. In this paper, we propose a novel parallel blockwise distillation algorithm to accelerate the distillation process of sophisticated DNNs. Our algorithm leverages local information to…
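A toy sketch of why blockwise distillation can run in parallel, assuming that "local information" means each student block is fit using only the input and output activations of its matching teacher block. Everything here is hypothetical and simplified: the blocks are scalar affine maps, the fit is closed-form least squares rather than gradient training, and threads stand in for GPUs. It is not the authors' implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy "teacher": a chain of three blocks, each a scalar affine map (hypothetical).
teacher_blocks = [
    lambda x: 2.0 * x + 1.0,
    lambda x: -0.5 * x + 3.0,
    lambda x: 4.0 * x - 2.0,
]


def teacher_activations(inputs):
    """Return activations at every block boundary; acts[i] is the input to block i."""
    acts = [list(inputs)]
    for block in teacher_blocks:
        acts.append([block(x) for x in acts[-1]])
    return acts


def fit_student_block(xs, ys):
    """Least-squares fit of y = w * x + b from one block's local input/output pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


inputs = [0.0, 1.0, 2.0, 3.0]
acts = teacher_activations(inputs)

# Block i needs only (acts[i], acts[i + 1]), so the fits do not depend on each other
# and can be dispatched to separate workers.
with ThreadPoolExecutor(max_workers=len(teacher_blocks)) as pool:
    students = list(pool.map(fit_student_block, acts[:-1], acts[1:]))


def student_forward(x):
    """Chain the independently fitted student blocks end to end."""
    for w, b in students:
        x = w * x + b
    return x


max_error = max(abs(student_forward(x) - y) for x, y in zip(inputs, acts[-1]))
print(students, max_error)
```

Because no student block waits on another student block, the work divides across workers once the teacher's activations are available; the fitted blocks are only chained together afterwards.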
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
Methods: Knowledge Distillation · Dropout · Softmax · 1x1 Convolution · Convolution · Dense Connections · Max Pooling · Kaiming Initialization
