Adaptive Teaching with Shared Classifier for Knowledge Distillation
Jaeyeon Jang, Young-Ik Kim, Jisu Lim, and Hyeonseong Lee

TL;DR
This paper introduces ATSC, an adaptive knowledge distillation method that dynamically aligns teacher networks with student learning needs using shared classifiers, achieving state-of-the-art results on CIFAR-100 and ImageNet.
Contribution
The paper proposes a novel adaptive teaching approach with shared classifiers that self-adjusts the teacher network for improved student learning, extending to multi-teacher environments.
Findings
Achieves state-of-the-art results on CIFAR-100 and ImageNet datasets.
Effective in both single-teacher and multi-teacher scenarios.
Requires only a modest increase in model parameters.
Abstract
Knowledge distillation (KD) is a technique used to transfer knowledge from an overparameterized teacher network to a less-parameterized student network, thereby minimizing the incurred performance loss. KD methods can be categorized into offline and online approaches. Offline KD leverages a powerful pretrained teacher network, while online KD allows the teacher network to be adjusted dynamically to enhance the learning effectiveness of the student network. Recently, it has been discovered that sharing the classifier of the teacher network can significantly boost the performance of the student network with only a minimal increase in the number of network parameters. Building on these insights, we propose adaptive teaching with a shared classifier (ATSC). In ATSC, the pretrained teacher network self-adjusts to better align with the learning needs of the student network based on its…
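As a rough illustration of the shared-classifier idea mentioned in the abstract, the sketch below routes a student feature vector through the teacher's classifier. It is not the authors' ATSC implementation. The linear projector that maps student features to the teacher's feature dimension is an assumption (the abstract does not say how the dimensions are matched), and all names and sizes (`init_linear`, `student_logits`, `STUDENT_DIM`, and so on) are hypothetical. The weights are random stand-ins for a pretrained teacher.

```python
import random


def linear(x, weight, bias):
    """Compute y = W x + b for one feature vector x (W is a list of rows)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def init_linear(out_dim, in_dim, rng):
    """Hypothetical helper: small random weights and zero biases."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def num_params(weight, bias):
    """Count the scalar parameters of one linear layer."""
    return sum(len(row) for row in weight) + len(bias)


rng = random.Random(0)
STUDENT_DIM, TEACHER_DIM, NUM_CLASSES = 4, 8, 10  # illustrative sizes only

# Teacher classifier: pretrained in practice, random here for the sketch.
teacher_clf = init_linear(NUM_CLASSES, TEACHER_DIM, rng)

# Assumed projector: the only parameters added on the student side.
projector = init_linear(TEACHER_DIM, STUDENT_DIM, rng)


def student_logits(student_feature):
    """Project the student feature, then reuse the teacher's classifier."""
    projected = linear(student_feature, *projector)
    return linear(projected, *teacher_clf)


feature = [rng.uniform(-1.0, 1.0) for _ in range(STUDENT_DIM)]
logits = student_logits(feature)
extra_params = num_params(*projector)  # 8 * 4 weights + 8 biases
```

Under these assumptions the student reuses the teacher's classifier weights rather than learning its own, so the added parameter count is that of the projector alone, which is one way to read the "minimal increase" in parameters the abstract describes.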
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Online Learning and Analytics
Methods: ALIGN
