Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup

Guodong Xu; Ziwei Liu; Chen Change Loy

arXiv:2012.09413·cs.CV·December 18, 2020·6 cites

TL;DR

This paper introduces UNIX, an uncertainty-aware mixup method that improves knowledge distillation efficiency by reducing training computation while maintaining or enhancing performance on CIFAR100 and ImageNet.

Contribution

The paper proposes UNIX, a novel approach combining uncertainty sampling and mixup to reduce redundancy and computational cost in knowledge distillation.

Findings

1. Outperforms conventional methods on CIFAR100 with 21% less computation
2. Achieves comparable results to traditional distillation on ImageNet
3. Reduces training redundancy by focusing on informative samples

Abstract

Knowledge distillation, which involves extracting the "dark knowledge" from a teacher network to guide the learning of a student network, has emerged as an essential technique for model compression and transfer learning. Unlike previous works that focus on the accuracy of the student network, here we study a little-explored but important question, i.e., knowledge distillation efficiency. Our goal is to achieve performance comparable to conventional knowledge distillation at a lower computation cost during training. We show that UNcertainty-aware mIXup (UNIX) can serve as a clean yet effective solution. An uncertainty sampling strategy is used to evaluate the informativeness of each training sample. Adaptive mixup is then applied to uncertain samples to compact knowledge. We further show that the redundancy of conventional knowledge distillation lies in the excessive learning of easy…
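The two steps named in the abstract, uncertainty sampling followed by mixup on the uncertain samples, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how uncertainty is measured or how mixing weights are chosen, so the sketch assumes prediction entropy of the student as the uncertainty score and a mixing weight that grows with that entropy. The function names (`rank_by_uncertainty`, `compact_batch`) and the `keep` parameter are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one sample's logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def rank_by_uncertainty(student_logits):
    """Return sample indices ordered from most to least uncertain.

    Assumption: uncertainty is the entropy of the student's prediction.
    """
    scores = [entropy(softmax(z)) for z in student_logits]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def mixup(x_a, x_b, lam):
    """Convex combination of two flattened inputs."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


def compact_batch(inputs, student_logits, keep):
    """Hypothetical helper: keep the `keep` most uncertain samples and mix
    each with one of the remaining (more confident) samples.

    Returns a list of (mixed_input, uncertain_idx, partner_idx, lam).
    """
    order = rank_by_uncertainty(student_logits)
    uncertain, confident = order[:keep], order[keep:]
    max_h = math.log(len(student_logits[0]))  # assumes at least 2 classes
    mixed = []
    for rank, i in enumerate(uncertain):
        # Cycle through the confident samples; fall back to the sample itself.
        j = confident[rank % len(confident)] if confident else i
        # Assumed weighting: more uncertain samples keep more of their own signal.
        h = entropy(softmax(student_logits[i])) / max_h
        lam = min(1.0, 0.5 + 0.5 * h)
        mixed.append((mixup(inputs[i], inputs[j], lam), i, j, lam))
    return mixed


if __name__ == "__main__":
    logits = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
    inputs = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 4.0]]
    for x, i, j, lam in compact_batch(inputs, logits, keep=2):
        print(i, j, round(lam, 3), x)
```

In this sketch a batch of four samples is compacted to two mixed inputs, so only those two would need a teacher forward pass. That is one plausible reading of where the training-time saving comes from; the abstract itself does not spell out the mechanism.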

Code & Models

Repositories

xuguodong03/UNIXKD
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning

Methods: Knowledge Distillation · Mixup