CES-KD: Curriculum-based Expert Selection for Guided Knowledge Distillation

Ibtihel Amara; Maryam Ziaeefard; Brett H. Meyer; Warren Gross; James J. Clark

arXiv:2209.07606·cs.CV·September 19, 2022·1 cites

TL;DR

CES-KD introduces a curriculum-based expert selection method that gradually guides a student network using stratified teachers based on data difficulty, improving knowledge distillation performance across multiple datasets and architectures.

Contribution

The paper proposes CES-KD, a novel curriculum-driven expert selection technique for knowledge distillation that dynamically chooses teachers based on data difficulty to better bridge the capacity gap.

Findings

01 CES-KD improves accuracy on various architectures.

02 The method is effective across multiple datasets.

03 Gradual teacher selection enhances student learning.

Abstract

Knowledge distillation (KD) is an effective tool for compressing deep classification models for edge devices. However, the performance of KD is affected by the large capacity gap between the teacher and student networks. Recent methods have resorted to a multiple teacher assistant (TA) setting for KD, which sequentially decreases the size of the teacher model to relatively bridge the size gap between these models. This paper proposes a new technique called Curriculum Expert Selection for Knowledge Distillation (CES-KD) to efficiently enhance the learning of a compact student under the capacity gap problem. This technique is built upon the hypothesis that a student network should be guided gradually using a stratified teaching curriculum, as it learns easy (hard) data samples better and faster from a lower (higher) capacity teacher network. Specifically, our method is a gradual TA-based KD…
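
The hypothesis in the abstract (easy samples from a lower-capacity teacher, hard samples from a higher-capacity one) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated here, so the per-sample difficulty score in [0, 1], the fixed `thresholds`, and the helper names `select_teacher` and `kd_loss` are all assumptions made for illustration. The loss is the standard temperature-softened KL divergence commonly used in KD, not necessarily the exact objective of CES-KD.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - lse for z in scaled]

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl

def select_teacher(difficulty, thresholds):
    """Map a difficulty score to a teacher index (0 = smallest teacher).

    ASSUMPTION: `difficulty` is a hypothetical score in [0, 1] and
    `thresholds` is an ascending list of hypothetical cut points; the
    paper's actual difficulty measure is not given in this excerpt.
    """
    for index, cut in enumerate(thresholds):
        if difficulty <= cut:
            return index
    return len(thresholds)

# Hypothetical setup: three teachers ordered by increasing capacity,
# each represented only by the logits it produced for one sample.
teacher_logits_by_capacity = [
    [2.0, 0.5, 0.1],   # small teacher
    [3.0, 0.2, -0.5],  # medium teacher
    [5.0, -1.0, -2.0], # large teacher
]
student_logits = [1.0, 0.8, 0.3]
thresholds = [0.33, 0.66]

easy_teacher = select_teacher(0.10, thresholds)  # easy sample
hard_teacher = select_teacher(0.90, thresholds)  # hard sample
loss_easy = kd_loss(student_logits, teacher_logits_by_capacity[easy_teacher])
loss_hard = kd_loss(student_logits, teacher_logits_by_capacity[hard_teacher])
```

In this sketch the teacher is picked per sample from a static bucket. How CES-KD schedules the selection over training is described in the part of the abstract that is cut off, so it is left out here.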

Taxonomy

Topics: COVID-19 diagnosis using AI · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning

Methods: Knowledge Distillation