Black-box Few-shot Knowledge Distillation
Dang Nguyen, Sunil Gupta, Kien Do, Svetha Venkatesh

TL;DR
This paper introduces a black-box few-shot knowledge distillation method that generates synthetic data to effectively transfer knowledge from a black-box teacher to a student with minimal unlabeled samples.
Contribution
It proposes a novel approach combining MixUp and a conditional VAE to generate diverse synthetic images for effective black-box KD with limited data.
Findings
Outperforms recent state-of-the-art few/zero-shot KD methods
Significantly improves image classification accuracy
Demonstrates effectiveness with minimal unlabeled data
Abstract
Knowledge distillation (KD) is an efficient approach to transfer the knowledge from a large "teacher" network to a smaller "student" network. Traditional KD methods require lots of labeled training samples and a white-box teacher (parameters are accessible) to train a good student. However, these resources are not always available in real-world applications. The distillation process often happens at an external party side where we do not have access to much data, and the teacher does not disclose its parameters due to security and privacy concerns. To overcome these challenges, we propose a black-box few-shot KD method to train the student with few unlabeled training samples and a black-box teacher. Our main idea is to expand the training set by generating a diverse set of out-of-distribution synthetic images using MixUp and a conditional variational auto-encoder. These synthetic images…
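The data-expansion idea in the abstract can be illustrated with a minimal sketch of the MixUp half only: blend pairs of the few available unlabeled images, then ask the black-box teacher for a soft label on each blend. The conditional VAE component is not shown. Everything named here is an assumption for illustration rather than the authors' implementation: `toy_teacher` is a hypothetical stand-in for a real prediction API, images are flat lists of floats in [0, 1], and drawing the mixing weight from a Beta(alpha, alpha) distribution follows common MixUp practice, not a detail confirmed by this summary.

```python
import random


def mixup(x_a, x_b, lam):
    """Pixel-wise convex combination of two images (flat lists of floats)."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


def build_distillation_set(images, teacher, n_synthetic, alpha=1.0, seed=0):
    """Create synthetic images by MixUp and label them with the teacher.

    `teacher` is treated as a black box: it is only called on an input
    and returns class probabilities; its parameters are never touched.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_synthetic):
        x_a, x_b = rng.sample(images, 2)      # two distinct source images
        lam = rng.betavariate(alpha, alpha)   # assumed mixing distribution
        x_mix = mixup(x_a, x_b, lam)
        pairs.append((x_mix, teacher(x_mix)))  # soft label from the teacher
    return pairs


def toy_teacher(image):
    """Hypothetical teacher: two-class probabilities from mean intensity."""
    p = min(max(sum(image) / len(image), 0.0), 1.0)
    return [1.0 - p, p]


# A "few-shot" pool of three tiny unlabeled images.
images = [[0.0] * 4, [1.0] * 4, [0.5] * 4]
synthetic = build_distillation_set(images, toy_teacher, n_synthetic=5)
```

Each entry of `synthetic` is an (image, soft label) pair that a student could be trained on, for example by minimizing cross-entropy against the teacher's probabilities. No labels for the original images are needed, because every training target comes from querying the teacher.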
Taxonomy
Topics: COVID-19 diagnosis using AI · AI in cancer detection · Advanced Neural Network Applications
Methods: Mixup
