A Probabilistic Model Behind Self-Supervised Learning

Alice Bizeul; Bernhard Schölkopf; Carl Allen

arXiv:2402.01399 · cs.LG · October 16, 2024 · 1 citation

TL;DR

This paper introduces a probabilistic generative latent variable model for self-supervised learning that unifies several families of discriminative SSL methods. Fitting the model generatively improves representation quality, especially on tasks that depend on style information.

Contribution

The paper proposes a generative latent variable model that unifies different SSL approaches, and introduces SimVAE, a method that learns representations by fitting that model generatively.

Findings

01. SimVAE outperforms existing SSL methods on simple benchmarks.

02. The model provides a theoretical framework linking SSL to mutual information.

03. SimVAE narrows the gap between generative and discriminative methods.

Abstract

In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels. A common task is to classify augmentations or different modalities of the data, which share semantic content (e.g. an object in an image) but differ in style (e.g. the object's location). Many approaches to self-supervised learning have been proposed, e.g. SimCLR, CLIP, and DINO, which have recently gained much attention because their representations achieve downstream performance comparable to supervised learning. However, a theoretical understanding of self-supervised methods remains elusive. Addressing this, we present a generative latent variable model for self-supervised learning and show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations, providing a unifying theoretical framework for these methods. The…

Code & Models

Repositories

alicebizeul/simvae · PyTorch · Official

Taxonomy

Topics: Machine Learning and Data Classification

Methods: Attention Is All You Need · Linear Layer · Softmax · Multi-Head Attention · Residual Connection · Layer Normalization · Vision Transformer · self-DIstillation with NO labels · Average Pooling · Dense Connections