SEED: Self-supervised Distillation For Visual Representation

Zhiyuan Fang; Jianfeng Wang; Lijuan Wang; Lei Zhang; Yezhou Yang; Zicheng Liu

arXiv:2101.04731 · cs.CV · April 19, 2021 · 74 cites

TL;DR

SEED introduces a self-supervised distillation approach where a large teacher network transfers knowledge to a smaller student network, significantly improving small model performance on image classification tasks.

Contribution

The paper proposes SEED, a novel self-supervised distillation method that enhances small model performance by transferring knowledge from larger models without labeled data.

Findings

01. SEED raises the top-1 accuracy of EfficientNet-B0 from 42.2% to 67.6%.

02. SEED raises the top-1 accuracy of MobileNet-v3-Large from 36.3% to 68.2%.

03. Both results are gains over the corresponding self-supervised baselines trained without distillation.

Abstract

This paper is concerned with self-supervised learning for small models. The problem is motivated by our empirical studies that while the widely used contrastive self-supervised learning method has shown great progress on large model training, it does not work well for small models. To address this problem, we propose a new learning paradigm, named SElf-SupErvised Distillation (SEED), where we leverage a larger network (as Teacher) to transfer its representational knowledge into a smaller architecture (as Student) in a self-supervised fashion. Instead of directly learning from unlabeled data, we train a student encoder to mimic the similarity score distribution inferred by a teacher over a set of instances. We show that SEED dramatically boosts the performance of small networks on downstream tasks. Compared with self-supervised baselines, SEED improves the top-1 accuracy from 42.2% to…
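The core idea in the abstract, a student mimicking the teacher's similarity score distribution over a set of instances, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of cosine similarity, the cross-entropy objective, and the temperature values are all assumptions chosen for illustration, and the instance set is a plain list rather than whatever structure the paper maintains.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores, temperature):
    # Numerically stable softmax over temperature-scaled scores.
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def similarity_distribution(embedding, instances, temperature):
    # Similarity of one embedding to every instance, turned into a distribution.
    z = l2_normalize(embedding)
    scores = [dot(z, l2_normalize(inst)) for inst in instances]
    return softmax(scores, temperature)

def distillation_loss(teacher_emb, student_emb, instances,
                      t_teacher=0.2, t_student=0.2):
    # Cross-entropy between the teacher's and the student's distributions.
    # Temperatures are illustrative defaults (assumption), not values
    # taken from this page.
    p_teacher = similarity_distribution(teacher_emb, instances, t_teacher)
    p_student = similarity_distribution(student_emb, instances, t_student)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))

if __name__ == "__main__":
    # Hypothetical toy data: three stored instance embeddings in 2-D.
    instances = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    teacher = [1.0, 0.1]
    aligned_student = [2.0, 0.2]      # same direction as the teacher
    misaligned_student = [-1.0, 0.1]  # points the opposite way
    print(distillation_loss(teacher, aligned_student, instances))
    print(distillation_loss(teacher, misaligned_student, instances))
```

In this toy setup the student whose embedding points in the same direction as the teacher's incurs the lower loss, which is the behavior the mimicking objective is meant to reward. No labels appear anywhere in the computation; the only supervision is the teacher's distribution.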

Code & Models

Repositories

jacobswan1/SEED (PyTorch, official)

Videos

SEED: Self-supervised Distillation For Visual Representation · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques

Methods: Dense Connections · Random Gaussian Blur · InfoNCE · Feedforward Network · MoCo v2