Unified and Effective Ensemble Knowledge Distillation
Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang

TL;DR
This paper introduces a novel ensemble knowledge distillation approach that leverages both labeled and unlabeled data, weighting teacher predictions by correctness and disagreement to improve student model performance.
Contribution
It proposes a unified method that distills a single student from multiple teachers on both labeled and unlabeled data, weighting teacher predictions by correctness and disagreement.
Findings
Improves student model accuracy across four datasets.
Effectively utilizes unlabeled data for knowledge transfer.
Outperforms existing ensemble distillation methods.
Abstract
Ensemble knowledge distillation can extract knowledge from multiple teacher models and encode it into a single student model. Many existing methods learn and distill the student model on labeled data only. However, the teacher models are usually learned on the same labeled data, and their predictions have high correlations with ground-truth labels. Thus, they cannot provide sufficient knowledge complementary to task labels for student teaching. Distilling on unseen unlabeled data has the potential to enhance the knowledge transfer from the teachers to the student. In this paper, we propose a unified and effective ensemble knowledge distillation method that distills a single student model from an ensemble of teacher models on both labeled and unlabeled data. Since different teachers may have diverse prediction correctness on the same sample, on labeled data we weight the predictions of…
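The two weighting ideas named in the summary (correctness on labeled data, disagreement on unlabeled data) can be illustrated with a minimal NumPy sketch. The abstract is truncated here and does not give the formulas, so everything below is an assumption made for illustration, not the authors' method: the function names are hypothetical, correctness is taken to be the probability a teacher assigns to the true class, and disagreement is taken to be the variance of teacher predictions.

```python
import numpy as np


def correctness_weighted_target(teacher_probs, labels, eps=1e-12):
    """Combine teacher predictions on labeled samples into one soft target.

    teacher_probs: array of shape (T, N, C), class probabilities per teacher.
    labels:        int array of shape (N,), ground-truth class indices.
    Assumption: a teacher's weight on a sample is proportional to the
    probability it assigns to the true class.
    """
    n = labels.shape[0]
    # Probability each teacher gives to the true class, shape (T, N).
    p_true = teacher_probs[:, np.arange(n), labels] + eps
    # Normalize over teachers so the weights for each sample sum to 1.
    weights = p_true / p_true.sum(axis=0, keepdims=True)
    # Weighted average of teacher distributions, shape (N, C).
    return (weights[:, :, None] * teacher_probs).sum(axis=0)


def disagreement_weights(teacher_probs):
    """Per-sample loss weights on unlabeled data from teacher disagreement.

    Assumption: disagreement is the variance across teachers, summed over
    classes, rescaled so the weights average to 1 over the batch.
    """
    var = teacher_probs.var(axis=0).sum(axis=-1)  # shape (N,)
    total = var.sum()
    if total == 0:
        # All teachers agree everywhere: fall back to uniform weights.
        return np.ones_like(var)
    return var * var.shape[0] / total


# Toy batch: 2 teachers, 2 samples, 2 classes.
teacher_probs = np.array([
    [[0.9, 0.1], [0.5, 0.5]],
    [[0.3, 0.7], [0.5, 0.5]],
])
labels = np.array([0, 1])

soft_target = correctness_weighted_target(teacher_probs, labels)
sample_weights = disagreement_weights(teacher_probs)
```

In this toy batch the first teacher is more accurate on the first sample, so it dominates that sample's soft target, and the second sample, where the teachers agree exactly, receives zero disagreement weight. A student would then be trained against `soft_target` on labeled data and against the teacher ensemble with `sample_weights` scaling the per-sample distillation loss on unlabeled data.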
Taxonomy
Topics: Machine Learning and Data Classification · Online Learning and Analytics · Domain Adaptation and Few-Shot Learning
Methods: Knowledge Distillation
