Benchmark for Uncertainty & Robustness in Self-Supervised Learning
Ha Manh Bui, Iliana Maifeld-Carucci

TL;DR
This paper establishes a benchmark for evaluating uncertainty estimation and robustness in self-supervised learning methods across vision and language tasks, addressing a critical gap in assessing model reliability in high-stakes applications.
Contribution
It introduces a comprehensive benchmark evaluating SSL methods on generalization and uncertainty under distributional shifts, with experimental results and open-source code for reproducibility.
Findings
SSL methods vary in robustness and uncertainty estimation
Benchmark provides standardized evaluation across datasets
Results highlight need for improved SSL reliability measures
Abstract
Self-Supervised Learning (SSL) is crucial for real-world applications, especially in data-hungry domains such as healthcare and self-driving cars. In addition to a lack of labeled data, these applications also suffer from distributional shifts. Therefore, an SSL method should provide robust generalization and uncertainty estimation in the test dataset to be considered a reliable model in such high-stakes domains. However, existing approaches often focus on generalization, without evaluating the model's uncertainty. The ability to compare SSL techniques for improving these estimates is therefore critical for research on the reliability of self-supervision models. In this paper, we explore variants of SSL methods, including Jigsaw Puzzles, Context, Rotation, Geometric Transformations Prediction for vision, as well as BERT and GPT for language tasks. We train SSL in auxiliary learning for…
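The visible part of the abstract does not say which uncertainty metrics the benchmark reports. As an illustration only, the sketch below computes expected calibration error (ECE), a common way to check whether a classifier's confidence matches its accuracy. The function name, the equal-width binning, and the default of 10 bins are assumptions made for this example, not details taken from the paper.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Illustrative ECE with equal-width confidence bins (assumed setup).

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct:     whether the corresponding prediction was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins   # sum of confidences per bin
    hit_sum = [0.0] * n_bins    # number of correct predictions per bin
    count = [0] * n_bins        # number of predictions per bin

    for c, ok in zip(confidences, correct):
        # Bin k covers [k / n_bins, (k + 1) / n_bins); 1.0 goes in the last bin.
        k = min(int(c * n_bins), n_bins - 1)
        conf_sum[k] += c
        hit_sum[k] += 1.0 if ok else 0.0
        count[k] += 1

    ece = 0.0
    for k in range(n_bins):
        if count[k] == 0:
            continue
        avg_conf = conf_sum[k] / count[k]
        avg_acc = hit_sum[k] / count[k]
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += (count[k] / n) * abs(avg_acc - avg_conf)
    return ece
```

Under this sketch, a lower value means confidence tracks accuracy more closely. Computing the same quantity on an in-distribution test set and on a shifted test set is one way to compare how calibration degrades under distributional shift.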
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Digital Imaging for Blood Diseases · AI in cancer detection
Methods: Attention Is All You Need · Test · Cosine Annealing · Weight Decay · Linear Warmup With Linear Decay · Linear Layer · Linear Warmup With Cosine Annealing · Softmax · WordPiece · Layer Normalization
