Slimmable Networks for Contrastive Self-supervised Learning

Shuai Zhao; Linchao Zhu; Xiaohan Wang; Yi Yang

arXiv:2209.15525 · cs.CV · July 30, 2024 · 1 citation

TL;DR

This paper introduces SlimCLR, a one-stage, slimmable network approach for contrastive self-supervised learning that eliminates the need for teacher models, addressing performance issues in small models through novel training techniques.

Contribution

The paper proposes SlimCLR, a novel slimmable network framework for contrastive self-supervised learning, with techniques to improve training stability and performance of small models without extra teachers.

Findings

01 SlimCLR outperforms previous methods with fewer parameters and FLOPs.

02 The introduced techniques stabilize training and improve small model performance.

03 Theoretical analysis shows that a switchable linear probe layer is more effective than a single slimmable linear layer during linear evaluation.

Abstract

Self-supervised learning has made significant progress in pre-training large models but struggles with small ones. Mainstream solutions rely mainly on knowledge distillation, a two-stage procedure: first train a large teacher model, then distill it to improve the generalization ability of smaller ones. In this work, we introduce an alternative one-stage solution that obtains pre-trained small models without extra teachers: slimmable networks for contrastive self-supervised learning (SlimCLR). A slimmable network consists of a full network and several weight-sharing sub-networks, which can be pre-trained once to obtain various networks, including small ones with low computation costs. However, interference between weight-sharing networks leads to severe performance degradation in self-supervised cases, as evidenced by gradient magnitude…
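The weight sharing described in the abstract can be illustrated with a toy layer. The following is a minimal sketch, not the SlimCLR implementation: the class name `SlimmableLinear`, the `width_mult` parameter, and the choice to take the leading rows of the weight matrix as the sub-network are assumptions made for illustration. It only shows that a narrower sub-network reuses a slice of the full network's weights rather than owning separate parameters.

```python
import random


class SlimmableLinear:
    """Hypothetical toy layer: sub-networks reuse the leading rows of one shared weight matrix."""

    def __init__(self, in_features, out_features, seed=0):
        rng = random.Random(seed)
        self.in_features = in_features
        self.out_features = out_features
        # A single weight matrix of shape (out_features, in_features),
        # shared by the full network and every sub-network.
        self.weight = [
            [rng.uniform(-1.0, 1.0) for _ in range(in_features)]
            for _ in range(out_features)
        ]

    def forward(self, x, width_mult=1.0):
        # The input length selects how many columns are active;
        # the width multiplier selects how many output rows are active.
        n_in = len(x)
        if n_in > self.in_features:
            raise ValueError("input is wider than the full layer")
        n_out = max(1, int(self.out_features * width_mult))
        return [
            sum(self.weight[o][i] * x[i] for i in range(n_in))
            for o in range(n_out)
        ]


layer = SlimmableLinear(in_features=4, out_features=8)
x = [1.0, 2.0, 3.0, 4.0]
full = layer.forward(x, width_mult=1.0)  # full network: 8 outputs
half = layer.forward(x, width_mult=0.5)  # sub-network: first 4 outputs
```

Because both widths read from the same `weight` matrix, a gradient update made for one width also changes the others, which is the kind of coupling the abstract refers to as interference between weight-sharing networks.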

Code & Models

Repositories

mzhaoshuai/slimclr (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM · Speech and Audio Processing

Methods: Linear Layer · Knowledge Distillation · Contrastive Learning