Adaptive Teaching with Shared Classifier for Knowledge Distillation

Jaeyeon Jang, Young-Ik Kim, Jisu Lim, and Hyeonseong Lee

arXiv:2406.08528 · cs.CV · June 17, 2024

TL;DR

This paper introduces ATSC, an adaptive knowledge distillation method that dynamically aligns teacher networks with student learning needs using shared classifiers, achieving state-of-the-art results on CIFAR-100 and ImageNet.

Contribution

The paper proposes an adaptive teaching approach with a shared classifier, in which the teacher network self-adjusts to improve student learning. The approach also extends to multi-teacher settings.

Findings

1. Achieves state-of-the-art results on the CIFAR-100 and ImageNet datasets.
2. Is effective in both single-teacher and multi-teacher scenarios.
3. Requires only a modest increase in model parameters.

Abstract

Knowledge distillation (KD) is a technique used to transfer knowledge from an overparameterized teacher network to a less-parameterized student network, thereby minimizing the incurred performance loss. KD methods can be categorized into offline and online approaches. Offline KD leverages a powerful pretrained teacher network, while online KD allows the teacher network to be adjusted dynamically to enhance the learning effectiveness of the student network. Recently, it has been discovered that sharing the classifier of the teacher network can significantly boost the performance of the student network with only a minimal increase in the number of network parameters. Building on these insights, we propose adaptive teaching with a shared classifier (ATSC). In ATSC, the pretrained teacher network self-adjusts to better align with the learning needs of the student network based on its…
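The shared-classifier idea mentioned in the abstract can be illustrated with a small sketch. The student reuses the teacher's classifier, and the only extra parameters are a projector that maps student features into the teacher's feature space. This is a minimal illustration under stated assumptions, not the authors' ATSC implementation: the layer sizes, the linear projector, the random weights, and the temperature-softened KL loss (the classic KD objective, swapped in here for illustration) are all hypothetical choices. It uses plain Python rather than PyTorch so that it runs without dependencies.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def random_matrix(rows, cols, rng):
    """Stand-in for trained weights (assumption: random values)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
NUM_CLASSES, TEACHER_DIM, STUDENT_DIM = 5, 8, 4  # hypothetical sizes

# The teacher's classifier is shared: the student has no classifier of its own.
teacher_classifier = random_matrix(NUM_CLASSES, TEACHER_DIM, rng)
# The projector is the only parameter block added on the student side.
projector = random_matrix(TEACHER_DIM, STUDENT_DIM, rng)

# Stand-ins for backbone outputs on one input (assumption: random features).
teacher_features = [rng.uniform(-1.0, 1.0) for _ in range(TEACHER_DIM)]
student_features = [rng.uniform(-1.0, 1.0) for _ in range(STUDENT_DIM)]

teacher_logits = matvec(teacher_classifier, teacher_features)
# Student path: project features, then apply the shared teacher classifier.
student_logits = matvec(teacher_classifier, matvec(projector, student_features))

TEMPERATURE = 4.0  # hypothetical softening temperature
teacher_probs = softmax(teacher_logits, TEMPERATURE)
student_probs = softmax(student_logits, TEMPERATURE)

# Illustrative distillation signal: how far the student is from the teacher.
kd_loss = kl_divergence(teacher_probs, student_probs)
extra_parameters = TEACHER_DIM * STUDENT_DIM
print(f"KD loss: {kd_loss:.4f}, extra parameters: {extra_parameters}")
```

In this toy setup the student adds only TEACHER_DIM × STUDENT_DIM projector weights, which is one way to picture the "minimal increase in the number of network parameters" described above. How ATSC adjusts the teacher itself is not shown here.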

Code & Models

Repositories

random2314235/atsc
PyTorch · Official

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Online Learning and Analytics

Methods: ALIGN