Activation Map Adaptation for Effective Knowledge Distillation

Zhiyuan Wu; Hong Qi; Yu Jiang; Minghao Zhao; Chupeng Cui; Zongmin Yang; and Xinhui Xue

arXiv:2010.13500 · cs.CV · April 15, 2022

TL;DR

This paper introduces an activation map adaptation method for knowledge distillation that improves student network accuracy and training speed by adaptively selecting supervisory features during training.

Contribution

The paper proposes an activation map adaptive module that improves knowledge transfer during neural network compression, raising the student network's accuracy and its training efficiency.

Findings

01. Boosts student network accuracy by 0.6%

02. Reduces training loss by 6.5%

03. Speeds up the training process

Abstract

Model compression has become a recent trend because of the need to deploy neural networks on embedded and mobile devices, where both accuracy and efficiency are critically important. To balance the two, this paper proposes a knowledge distillation strategy for general visual representation learning. The strategy uses an activation map adaptive module to replace some blocks of the teacher network, adaptively exploring the most appropriate supervisory features during training. The teacher's hidden-layer outputs then guide the training of the student network so that effective semantic information is transferred. To verify the strategy, the method is applied to the CIFAR-10 dataset. The results show that it boosts the accuracy of the student network by 0.6% with a 6.5% reduction in loss, and significantly improves its training speed.
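The abstract describes supervising the student with the teacher's hidden-layer output, passed through an adaptive module. This page does not give the module's internals, so the following is only a minimal sketch under stated assumptions: the adapter is taken to be a simple channel-mixing (1x1) linear map, the feature term is a mean squared error, and the names `adapt`, `feature_mse`, `distillation_loss`, and the weight `beta` are all hypothetical rather than taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Standard classification loss on the student's own prediction."""
    return -math.log(softmax(logits)[label])


def adapt(teacher_map, weights):
    """Hypothetical adapter: mix teacher channels with a 1x1 linear map.

    teacher_map: list of C_t channels, each a flat list of H*W activations.
    weights: C_s rows, each holding C_t mixing coefficients (assumed learnable).
    Returns C_s channels shaped like the student's activation map.
    """
    n = len(teacher_map[0])
    return [
        [sum(w * ch[i] for w, ch in zip(row, teacher_map)) for i in range(n)]
        for row in weights
    ]


def feature_mse(student_map, target_map):
    """Mean squared error between two activation maps of equal shape."""
    diffs = [
        (s - t) ** 2
        for s_ch, t_ch in zip(student_map, target_map)
        for s, t in zip(s_ch, t_ch)
    ]
    return sum(diffs) / len(diffs)


def distillation_loss(student_logits, label, student_map, teacher_map,
                      weights, beta=0.5):
    """Task loss plus a weighted feature term (beta is an assumed value)."""
    hint = feature_mse(student_map, adapt(teacher_map, weights))
    return cross_entropy(student_logits, label) + beta * hint
```

In this sketch the feature term vanishes when the student's activation map matches the adapted teacher map, so the student is pulled toward the supervisory features while still being trained on the labels. A real implementation would use a tensor library and learn the adapter weights jointly with the student.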

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition

Methods: Knowledge Distillation