Towards Democratizing Joint-Embedding Self-Supervised Learning

Florian Bordes; Randall Balestriero; Pascal Vincent

arXiv:2303.01986 · cs.LG · March 6, 2023 · 6 cites

TL;DR

This paper challenges common misconceptions in joint-embedding self-supervised learning, demonstrating that simpler, less resource-intensive methods can achieve competitive results, and provides an optimized PyTorch library to facilitate broader research and evaluation.

Contribution

It debunks prevalent myths in JE-SSL, showing that effective representations can be learned with minimal data augmentation and resources, and introduces a user-friendly PyTorch library for easier experimentation.

Findings

01

Simpler data augmentation suffices for effective JE-SSL.

02

Training with a single negative example can still learn useful representations.

03

Misconceptions about batch size and augmentation requirements are largely unfounded.
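
Finding 02 can be made concrete with a small sketch. The snippet below assumes the standard InfoNCE-style contrastive objective reduced to one positive and one negative per anchor; the source page does not give the paper's exact formulation, so the function names (`cosine_similarity`, `single_negative_loss`) and the temperature value are illustrative assumptions, not the API of the ffcv-ssl library.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def single_negative_loss(anchor, positive, negative, temperature=0.5):
    """InfoNCE-style loss with exactly one negative (illustrative sketch).

    loss = -log( exp(s_pos) / (exp(s_pos) + exp(s_neg)) )
         = log(1 + exp(s_neg - s_pos))
    where s_* are temperature-scaled cosine similarities.
    The temperature of 0.5 is an arbitrary value chosen for illustration.
    """
    s_pos = cosine_similarity(anchor, positive) / temperature
    s_neg = cosine_similarity(anchor, negative) / temperature
    diff = s_neg - s_pos
    # Numerically stable softplus: log(1 + exp(diff)).
    return max(diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Hypothetical 2-D embeddings: the positive matches the anchor,
# the negative is orthogonal to it.
anchor = [1.0, 0.0]
aligned = [1.0, 0.0]
orthogonal = [0.0, 1.0]

loss_good = single_negative_loss(anchor, aligned, orthogonal)
loss_bad = single_negative_loss(anchor, orthogonal, aligned)
```

With a single negative the objective reduces to a two-way classification between the positive and the negative, so the loss is low when the anchor sits closer to its positive and high when the roles are swapped.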

Abstract

Joint Embedding Self-Supervised Learning (JE-SSL) has seen rapid developments in recent years, due to its promise to effectively leverage large unlabeled data. The development of JE-SSL methods was driven primarily by the search for ever-increasing downstream classification accuracies, using huge computational resources, and typically built upon insights and intuitions inherited from a close parent JE-SSL method. This has unwittingly led to numerous preconceived ideas that carried over across methods, e.g. that SimCLR requires very large mini-batches to yield competitive accuracies, or that strong and computationally slow data augmentations are required. In this work, we debunk several such ill-formed a priori ideas in the hope of unleashing the full potential of JE-SSL free of unnecessary limitations. In fact, when carefully evaluating performances across different downstream tasks and…

Code & Models

Repositories

facebookresearch/ffcv-ssl (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Mycobacterium research and diagnosis · Respiratory viral infections research

Methods: Lib · 1x1 Convolution · Residual Block · Average Pooling · Residual Connection · Batch Normalization · Bottleneck Residual Block · Kaiming Initialization · Global Average Pooling