Unveiling the Potential of Probabilistic Embeddings in Self-Supervised Learning
Denis Janiak, Jakub Binkowski, Piotr Bielak, Tomasz Kajdanowicz

TL;DR
This paper explores the use of probabilistic embeddings in self-supervised learning, analyzing their impact on information compression, out-of-distribution detection, and the trade-offs involved within an information-theoretic framework.
Contribution
It introduces explicit stochastic modeling of embeddings in self-supervised learning and investigates their effects on performance and out-of-distribution detection from an information-theoretic perspective.
Findings
Probabilistic embeddings improve out-of-distribution detection.
Constraining representation or loss space affects information preservation.
Adding a bottleneck in the loss space enhances OOD detection capabilities.
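To make the idea of a probabilistic embedding concrete, here is a minimal sketch in plain Python. It assumes a diagonal Gaussian embedding with a standard normal prior, which is a common choice for stochastic representations but is not confirmed by this summary to be the authors' exact setup. The function names `sample_gaussian_embedding` and `kl_to_standard_normal` are hypothetical and used for illustration only.

```python
import math
import random


def sample_gaussian_embedding(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick).

    Assumption: the embedding is a diagonal Gaussian parameterized by a mean
    vector `mu` and a log-variance vector `logvar`.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for a single embedding.

    Assumption: a standard normal prior; this term acts as a compression
    (bottleneck) penalty on the stochastic embedding.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))


# Illustrative values only; in practice an encoder network would predict them.
rng = random.Random(0)
mu = [0.5, -1.0, 0.0]
logvar = [0.0, -0.5, 0.2]

z = sample_gaussian_embedding(mu, logvar, rng)  # one stochastic embedding
kl = kl_to_standard_normal(mu, logvar)          # bottleneck penalty, always >= 0
```

In a training loop, a term like `kl` would typically be weighted and added to the self-supervised loss. A per-example quantity such as the predicted variance could then serve as an uncertainty score for out-of-distribution detection, though which score the paper actually uses is not stated here.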
Abstract
In recent years, self-supervised learning has played a pivotal role in advancing machine learning by allowing models to acquire meaningful representations from unlabeled data. An intriguing research avenue involves developing self-supervised models within an information-theoretic framework, but many studies deviate from the stochasticity assumptions made when deriving their objectives. To gain deeper insight into this issue, we propose to explicitly model the representation with stochastic embeddings and assess their effects on performance, information compression, and potential for out-of-distribution detection. From an information-theoretic perspective, we investigate the impact of probabilistic modeling on the information bottleneck, shedding light on a trade-off between compression and preservation of information in both representation and loss space. Emphasizing the…
Taxonomy
Topics: Machine Learning and Data Classification · AI in cancer detection · Domain Adaptation and Few-Shot Learning
