Distribution Shift Matters for Knowledge Distillation with Webly Collected Images

Jialiang Tang; Shuo Chen; Gang Niu; Masashi Sugiyama; Chen Gong

arXiv:2307.11469·cs.CV·July 24, 2023

TL;DR

This paper introduces KD$^{3}$, a data-free knowledge distillation method that addresses the distribution shift between the original training data and webly collected data through instance selection, feature alignment, and distribution-invariant learning.

Contribution

The paper proposes KD$^{3}$, an approach that handles the distribution shift in webly collected data during knowledge distillation, improving the reliability of the student network when the original training data is unavailable.

Findings

01. KD$^{3}$ outperforms existing data-free methods on benchmark datasets.

02. The method effectively mitigates distribution shift impacts.

03. Experimental results show improved student network performance.

Abstract

Knowledge distillation aims to learn a lightweight student network from a pre-trained teacher network. In practice, existing knowledge distillation methods are usually infeasible when the original training data is unavailable due to privacy issues and data management considerations. Data-free knowledge distillation approaches have therefore been proposed that collect training instances from the Internet. However, most of them ignore the common distribution shift between instances from the original training data and the webly collected data, which affects the reliability of the trained student network. To solve this problem, we propose a novel method dubbed "Knowledge Distillation between Different Distributions" (KD$^{3}$), which consists of three components. Specifically, we first dynamically select useful training instances from the webly collected data according to the combined predictions…
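
To make the setting concrete, the following is a minimal sketch of the standard temperature-softened distillation loss, together with a purely illustrative instance filter. It is not the KD$^{3}$ implementation: the abstract is truncated before the selection criterion is stated, so `select_instances`, its averaging rule, and the `threshold` parameter are hypothetical stand-ins, and all function names are assumptions introduced here.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of a list of logits at a given temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2.

    This is the generic distillation objective, not a loss taken from the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def select_instances(teacher_batch, student_batch, threshold=0.5):
    """Hypothetical filter (an assumption, not the paper's rule).

    Keeps the indices of instances whose averaged teacher/student prediction
    has a maximum class probability of at least `threshold`.
    """
    kept = []
    for i, (t_logits, s_logits) in enumerate(zip(teacher_batch, student_batch)):
        combined = [
            (a + b) / 2 for a, b in zip(softmax(t_logits), softmax(s_logits))
        ]
        if max(combined) >= threshold:
            kept.append(i)
    return kept


if __name__ == "__main__":
    teacher_batch = [[5.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    student_batch = [[4.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    kept = select_instances(teacher_batch, student_batch)
    losses = [kd_loss(teacher_batch[i], student_batch[i]) for i in kept]
    print(kept, losses)
```

In this sketch, a web image on which both networks are confident is retained, while one with near-uniform predictions is dropped before the distillation loss is computed. Whether KD$^{3}$ combines the predictions by averaging or by some other rule cannot be determined from the truncated abstract.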

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · AI in cancer detection

Methods: Knowledge Distillation · Contrastive Learning · ALIGN