Teaching What You Should Teach: A Data-Based Distillation Method
Shitong Shao, Huanran Chen, Zhen Huang, Linrui Gong, Shuai Wang, Xinxiao Wu

TL;DR
This paper introduces TST, a data-based knowledge distillation method that uses a neural network-based data augmentation module to generate targeted samples, improving the student model's generalization across multiple vision tasks.
Contribution
The paper proposes a data augmentation approach for knowledge distillation that learns which augmented samples match the teacher's strengths and expose the student's weaknesses, making distillation more efficient.
Findings
Achieves state-of-the-art results on CIFAR-10, ImageNet-1k, MS-COCO, and Cityscapes.
Effective across object recognition, detection, and segmentation tasks.
Visualization studies reveal optimal magnitudes and probabilities for distillation.
Abstract
In real teaching scenarios, an excellent teacher always teaches what he (or she) is good at but the student is not. This gives the student the best assistance in making up for his (or her) weaknesses and becoming a good one overall. Enlightened by this, we introduce the "Teaching what you Should Teach" strategy into a knowledge distillation framework, and propose a data-based distillation method named "TST" that searches for desirable augmented samples to assist in distilling more efficiently and rationally. To be specific, we design a neural network-based data augmentation module with a priori bias, which finds samples that match the teacher's strengths but expose the student's weaknesses by learning magnitudes and probabilities to generate suitable data samples. By training the data augmentation module and the generalized distillation paradigm in turn, a student model is learned with…
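The idea of searching for samples that the teacher handles well but the student does not can be illustrated with a toy sketch. This is not the authors' implementation: the scoring rule (student cross-entropy minus teacher cross-entropy), the scalar "image" x, the additive augment function, the hand-written teacher and student models, and the grid of candidate magnitudes are all assumptions made for illustration. According to the abstract, TST learns magnitudes and probabilities with a neural module rather than scanning a fixed grid.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    # Negative log-probability of the true class, clamped to avoid log(0).
    return -math.log(max(probs[label], 1e-12))

def teaching_score(teacher_probs, student_probs, label):
    # Hypothetical score: large when the student is wrong or unsure
    # while the teacher is still confident on the same sample.
    return cross_entropy(student_probs, label) - cross_entropy(teacher_probs, label)

def augment(x, magnitude):
    # Toy augmentation (assumption): shift a scalar input by `magnitude`.
    return x + magnitude

def teacher(x):
    # Toy teacher (assumption): confidence in class 1 decays slowly with |x|.
    return softmax([0.0, 3.0 - 0.5 * abs(x)])

def student(x):
    # Toy student (assumption): confidence in class 1 decays quickly with |x|.
    return softmax([0.0, 3.0 - 2.0 * abs(x)])

def pick_magnitude(x, label, magnitudes):
    # Grid search stand-in for a learned augmentation module:
    # return the candidate magnitude with the highest teaching score.
    def score(m):
        x_aug = augment(x, m)
        return teaching_score(teacher(x_aug), student(x_aug), label)
    return max(magnitudes, key=score)

best = pick_magnitude(x=0.0, label=1, magnitudes=[0.0, 0.5, 1.0])
print(best)
```

In this toy setup the score keeps growing as the magnitude increases, so the candidate range has to be bounded. In an alternating scheme like the one the abstract describes, the augmentation step would be followed by a distillation step on the selected samples, and the two would repeat in turn.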
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)
Methods: Knowledge Distillation
