Establishing a stronger baseline for lightweight contrastive models

Wenye Lin; Yifeng Ding; Zhixiong Cao; Hai-tao Zheng

arXiv:2212.07158 · cs.CV · July 18, 2023

TL;DR

This paper improves lightweight contrastive learning models by optimizing training settings and introducing a smoothed loss, achieving performance close to larger models without needing a pretrained teacher.

Contribution

It establishes a stronger baseline for lightweight contrastive models by tailoring training recipes and proposing a smoothed InfoNCE loss, eliminating the need for pretrained teacher models.

Findings

01

Significant accuracy improvements on ImageNet for MobileNet-V3-Large and EfficientNet-B0.

02

Achieves performance close to ResNet50 with 5x fewer parameters.

03

Proposed a smoothed InfoNCE loss to reduce noise in contrastive learning.
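
The summary above does not spell out the form of the smoothed loss, so the following is only a minimal sketch under a stated assumption: that "smoothing" means label smoothing applied to the InfoNCE target, moving a small weight `alpha` from the positive pair onto the negatives. The function name `smoothed_info_nce`, its parameters, and the default values are hypothetical and are not taken from the paper or the linwenye/light-moco repository.

```python
import math


def smoothed_info_nce(pos_sim, neg_sims, temperature=0.2, alpha=0.1):
    """Label-smoothed InfoNCE for a single query (illustrative assumption).

    pos_sim:  similarity between the query and its positive view
    neg_sims: similarities between the query and each negative
    alpha:    smoothing weight; alpha=0 gives the standard InfoNCE loss
    """
    if not neg_sims:
        raise ValueError("at least one negative similarity is required")

    # Temperature-scaled logits, positive first.
    logits = [pos_sim / temperature] + [s / temperature for s in neg_sims]

    # Numerically stable log-softmax over positive and negatives.
    max_logit = max(logits)
    log_norm = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    log_probs = [z - log_norm for z in logits]

    # Soft target: 1 - alpha on the positive, alpha spread evenly over negatives.
    k = len(neg_sims)
    targets = [1.0 - alpha] + [alpha / k] * k

    # Cross-entropy between the soft target and the predicted distribution.
    return -sum(t * lp for t, lp in zip(targets, log_probs))


if __name__ == "__main__":
    hard = smoothed_info_nce(0.8, [0.1, -0.2, 0.3], alpha=0.0)
    soft = smoothed_info_nce(0.8, [0.1, -0.2, 0.3], alpha=0.1)
    print(f"standard InfoNCE: {hard:.4f}, smoothed: {soft:.4f}")
```

Under this assumption, a soft target keeps the loss from pushing all probability mass onto a positive pair that may be noisy, for example when two augmented crops share little content. Setting `alpha=0` recovers the unsmoothed loss, which makes the two easy to compare.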

Abstract

Recent research has reported a performance degradation in self-supervised contrastive learning for specially designed efficient networks, such as MobileNet and EfficientNet. A common practice to address this problem is to introduce a pretrained contrastive teacher model and train the lightweight networks with distillation signals generated by the teacher. However, it is time and resource consuming to pretrain a teacher model when it is not available. In this work, we aim to establish a stronger baseline for lightweight contrastive models without using a pretrained teacher model. Specifically, we show that the optimal recipe for efficient models is different from that of larger models, and using the same training settings as ResNet50, as previous research does, is inappropriate. Additionally, we observe a common issue in contrastive learning where either the positive or negative views…

Code & Models

Repositories

linwenye/light-moco
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Video Surveillance and Tracking Methods · Advanced Neural Network Applications

Methods: Pointwise Convolution · Depthwise Convolution · Depthwise Separable Convolution · Average Pooling · Dense Connections · Batch Normalization · Sigmoid Activation · 1x1 Convolution · Convolution