Unified and Effective Ensemble Knowledge Distillation

Chuhan Wu; Fangzhao Wu; Tao Qi; Yongfeng Huang

arXiv:2204.00548 · cs.LG · April 4, 2022 · 1 citation

Open Access

TL;DR

This paper introduces a novel ensemble knowledge distillation approach that leverages both labeled and unlabeled data, weighting teacher predictions by correctness and disagreement to improve student model performance.

Contribution

It proposes a unified method that distills knowledge from multiple teachers on both labeled and unlabeled data, weighting teacher predictions by their correctness and disagreement.

Findings

01. Improves student model accuracy across four datasets.

02. Effectively utilizes unlabeled data for knowledge transfer.

03. Outperforms existing ensemble distillation methods.

Abstract

Ensemble knowledge distillation can extract knowledge from multiple teacher models and encode it into a single student model. Many existing methods learn and distill the student model on labeled data only. However, the teacher models are usually learned on the same labeled data, and their predictions are highly correlated with the ground-truth labels. Thus, they cannot provide sufficient knowledge complementary to task labels for student teaching. Distilling on unseen unlabeled data has the potential to enhance the knowledge transfer from the teachers to the student. In this paper, we propose a unified and effective ensemble knowledge distillation method that distills a single student model from an ensemble of teacher models on both labeled and unlabeled data. Since different teachers may have diverse prediction correctness on the same sample, on labeled data we weight the predictions of…
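The abstract is cut off before it states the weighting formulas, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes, hypothetically, that a teacher's correctness on a labeled sample is the probability it assigns to the true class, and that disagreement on an unlabeled sample is the mean total-variation distance between each teacher and the ensemble average. The function names `correctness_weights`, `weighted_soft_target`, and `disagreement` are invented for this example.

```python
def correctness_weights(teacher_probs, label, eps=1e-12):
    """Assumed scheme: weight each teacher by the probability it gives
    the true label, normalized so the weights sum to 1."""
    scores = [probs[label] + eps for probs in teacher_probs]
    total = sum(scores)
    return [s / total for s in scores]


def weighted_soft_target(teacher_probs, weights):
    """Combine teacher distributions into one soft target for the student."""
    n_classes = len(teacher_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, teacher_probs))
        for c in range(n_classes)
    ]


def disagreement(teacher_probs):
    """Assumed measure: mean total-variation distance of each teacher
    from the uniform ensemble average (0 when all teachers agree)."""
    n = len(teacher_probs)
    mean = weighted_soft_target(teacher_probs, [1.0 / n] * n)
    distances = [
        0.5 * sum(abs(p - m) for p, m in zip(probs, mean))
        for probs in teacher_probs
    ]
    return sum(distances) / n


# Three hypothetical teachers on a two-class sample whose true label is 0.
teachers = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
weights = correctness_weights(teachers, label=0)
target = weighted_soft_target(teachers, weights)
score = disagreement(teachers)
```

In this sketch the most accurate teacher contributes most to the soft target on labeled data, while `score` could serve as a per-sample factor on unlabeled data. Whether higher disagreement should raise or lower a sample's weight is not stated in the excerpt above, so that choice is left open here.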

Taxonomy

Topics: Machine Learning and Data Classification · Online Learning and Analytics · Domain Adaptation and Few-Shot Learning

Methods: Knowledge Distillation