A Probabilistic Approach to Self-Supervised Learning using Cyclical Stochastic Gradient MCMC
Masoumeh Javanbakhat, Christoph Lippert

TL;DR
This paper introduces a Bayesian self-supervised learning method using cyclical stochastic gradient Hamiltonian Monte Carlo to produce diverse, interpretable embeddings that improve performance and out-of-distribution detection.
Contribution
The paper presents a novel Bayesian approach with cSGHMC for self-supervised learning, enabling exploration of expressive posterior distributions over embeddings.
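As a rough illustration of the sampler named above, the sketch below pairs a cosine cyclical step-size schedule with an SGHMC-style update on a toy one-dimensional Gaussian target. It is not the authors' implementation: the function names, the friction value, the toy target, and the choice to inject noise and collect samples only in the final steps of each cycle are all assumptions made for illustration.

```python
import math
import numpy as np

def cyclical_step_size(k, total_iters, num_cycles, alpha0):
    """Cosine cyclical step size for 1-indexed iteration k.

    The step size restarts at alpha0 at the start of every cycle and
    decays towards zero by the end of the cycle.
    """
    cycle_len = math.ceil(total_iters / num_cycles)
    pos = ((k - 1) % cycle_len) / cycle_len  # position in cycle, in [0, 1)
    return 0.5 * alpha0 * (math.cos(math.pi * pos) + 1.0)

def csghmc_toy(grad_u, theta0, total_iters=2000, num_cycles=4,
               alpha0=0.01, friction=0.1, sampling_steps=100, seed=0):
    """Toy cyclical SGHMC loop (hypothetical helper, illustration only).

    grad_u returns the gradient of the negative log posterior U(theta).
    Early in each cycle the update is noise-free (exploration); in the
    last `sampling_steps` iterations noise is injected and samples are kept.
    """
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    v = np.zeros_like(theta)  # momentum variable
    cycle_len = math.ceil(total_iters / num_cycles)
    samples = []
    for k in range(1, total_iters + 1):
        alpha = cyclical_step_size(k, total_iters, num_cycles, alpha0)
        in_sampling = ((k - 1) % cycle_len) >= cycle_len - sampling_steps
        theta = theta + v
        noise = rng.standard_normal(theta.shape) if in_sampling else 0.0
        v = ((1.0 - friction) * v
             - alpha * grad_u(theta)
             + math.sqrt(2.0 * friction * alpha) * noise)
        if in_sampling:
            samples.append(theta.copy())
    return np.array(samples)

# Toy target: U(theta) = theta^2 / 2, so grad U(theta) = theta.
samples = csghmc_toy(lambda th: th, [3.0])
```

In a self-supervised setting, `grad_u` would be a minibatch gradient of the self-supervised loss plus the prior term, and each kept `theta` would be one set of encoder weights.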
Findings
Improved performance on multiple classification datasets.
Enhanced calibration and out-of-distribution detection.
Demonstrated effectiveness on SVHN and CIFAR-10 datasets.
Abstract
In this paper, we present a practical Bayesian self-supervised learning method with Cyclical Stochastic Gradient Hamiltonian Monte Carlo (cSGHMC). Within this framework, we place a prior over the parameters of a self-supervised learning model and use cSGHMC to approximate the high-dimensional and multimodal posterior distribution over the embeddings. By exploring an expressive posterior over the embeddings, Bayesian self-supervised learning produces interpretable and diverse representations. Marginalizing over these representations yields a significant gain in performance, calibration, and out-of-distribution detection on a variety of downstream classification tasks. We provide experimental results on multiple classification tasks on four challenging datasets. Moreover, we demonstrate the effectiveness of the proposed method in out-of-distribution detection using the SVHN and CIFAR-10…
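The marginalization step in the abstract can be pictured as averaging downstream class probabilities over several posterior samples. The sketch below assumes that each posterior sample has already produced classifier logits for the same inputs. The helper names are hypothetical, and using predictive entropy as an out-of-distribution score is an illustrative choice, not something this summary confirms about the paper.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def bayesian_model_average(logits_per_sample):
    """Average class probabilities over posterior samples.

    logits_per_sample has shape (S, N, C): S posterior samples,
    N inputs, C classes. Returns probabilities of shape (N, C).
    """
    return softmax(logits_per_sample, axis=-1).mean(axis=0)

def predictive_entropy(probs, eps=1e-12):
    """Entropy of the averaged prediction; higher means more uncertain."""
    return -(probs * np.log(probs + eps)).sum(axis=-1)

# Stand-in logits: 3 posterior samples, 2 inputs, 4 classes (random, not real data).
rng = np.random.default_rng(0)
logits = rng.standard_normal((3, 2, 4))
avg_probs = bayesian_model_average(logits)
entropy = predictive_entropy(avg_probs)
```

Averaging probabilities rather than logits keeps each row a valid distribution, and disagreement between posterior samples shows up as a flatter averaged prediction.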
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
