Understanding the Role of Mixup in Knowledge Distillation: An Empirical Study

Hongjun Choi; Eun Som Jeon; Ankita Shukla; Pavan Turaga

arXiv:2211.03946 · cs.CV · November 10, 2022 · 1 citation

TL;DR

This paper empirically investigates how mixup data augmentation influences knowledge distillation, revealing the importance of smoothness and proposing strategies to improve student network training.

Contribution

It provides a detailed empirical analysis of the compatibility between mixup and knowledge distillation, highlighting the role of smoothness and suggesting improved training strategies.

Findings

01. Smoothness links mixup and KD.

02. Mixup enhances KD effectiveness.

03. Proposed strategies improve student network performance.

Abstract

Mixup is a popular data augmentation technique that creates new samples by linearly interpolating between two given data samples, improving both the generalization and robustness of the trained model. Knowledge distillation (KD), on the other hand, is widely used for model compression and transfer learning; it uses a larger network's implicit knowledge to guide the learning of a smaller network. At first glance, these two techniques seem very different; however, we found that "smoothness" is the connecting link between the two and is also a crucial attribute in understanding KD's interplay with mixup. Although many mixup variants and distillation methods have been proposed, much remains to be understood regarding the role of mixup in knowledge distillation. In this paper, we present a detailed empirical study on various important dimensions of compatibility between…
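
The two techniques named in the abstract can be illustrated with a minimal sketch. It shows the commonly used textbook forms only: mixup as a convex combination of two samples and their labels, and KD as a temperature-softened KL divergence between teacher and student outputs. It is not the specific training strategy studied in the paper or the code in the linked repository. The function names (`mixup`, `log_softmax`, `kd_loss`), the temperature of 4.0, and the Beta(1, 1) mixing prior are assumptions chosen for illustration.

```python
import math
import random


def mixup(x1, y1, x2, y2, lam):
    """Interpolate two samples and their one-hot labels with weight lam in [0, 1]."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_total for z in scaled]


def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2.

    The temperature value is an illustrative assumption, not taken from the paper.
    """
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))
    return temperature ** 2 * kl


# Hypothetical usage: draw a mixing weight, mix two toy samples,
# then compare a toy student against a toy teacher.
rng = random.Random(0)
lam = rng.betavariate(1.0, 1.0)  # Beta(alpha, alpha) with alpha = 1 (assumed)
mixed_x, mixed_y = mixup([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], lam)
loss = kd_loss([1.0, 0.5], [2.0, -1.0])
```

A higher temperature flattens the teacher distribution, which is one concrete sense in which the supervision signal becomes "smoother"; mixed labels such as `mixed_y` are likewise softer than one-hot targets.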

Code & Models

Repositories

hchoi71/mix-kd (PyTorch, official)

Videos

Understanding the Role of Mixup in Knowledge Distillation: An Empirical Study (YouTube)

Taxonomy

Topics: AI in cancer detection · Machine Learning and Data Classification · Generative Adversarial Networks and Image Synthesis

Methods: Mixup · Knowledge Distillation