Self-Soupervision: Cooking Model Soups without Labels
Anthony Fuller, James R. Green, Evan Shelhamer

TL;DR
This paper introduces Self-Soupervision, which extends model soups to self-supervised learning (SSL). Ingredients are trained without labels, may differ in SSL algorithm and hyperparameters, and are mixed into a single model to improve robustness and accuracy.
Contribution
It generalizes model soups to self-supervised learning, allowing mixing of different SSL algorithms and hyperparameters for enhanced robustness and performance.
Findings
Self-Souping on corrupted test data, followed by fine-tuning on uncorrupted train data, boosts robustness by +3.5% (ImageNet-C) and +7% (LAION-C).
Mixing diverse SSL ingredients improves accuracy.
Ingredients can differ in SSL hyperparameters and algorithms.
Abstract
Model soups are strange and strangely effective combinations of parameters. They take a model (the stock), fine-tune it into multiple models (the ingredients), and then mix their parameters back into one model (the soup) to improve predictions. While all known soups require supervised learning, and optimize the same loss on labeled data, our recipes for Self-Soupervision generalize soups to self-supervised learning (SSL). Our Self-Souping lets us flavor ingredients on new data sources, e.g. from unlabeled data from a task for transfer or from a shift for robustness. We show that Self-Souping on corrupted test data, then fine-tuning back on uncorrupted train data, boosts robustness by +3.5% (ImageNet-C) and +7% (LAION-C). Self-Soupervision also unlocks countless SSL algorithms to cook the diverse ingredients needed for more robust soups. We show for the first time that…
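The stock, ingredients, and soup steps described in the abstract can be illustrated with a minimal sketch of the mixing step. It assumes a uniform average of parameters, which is one common way to mix a soup; the abstract does not state which mixing rule the paper uses. The function name `uniform_soup`, the `Params` type, and the toy weights are hypothetical and only for illustration.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def uniform_soup(ingredients: List[Params]) -> Params:
    """Average each named parameter elementwise across all ingredients.

    Assumption: uniform averaging. The paper's actual mixing recipe
    is not given in the text above.
    """
    if not ingredients:
        raise ValueError("need at least one ingredient")
    names = ingredients[0].keys()
    for params in ingredients:
        if params.keys() != names:
            raise ValueError("ingredients must share the same parameter names")
    n = len(ingredients)
    soup: Params = {}
    for name in names:
        # Group the i-th weight of every ingredient, then take the mean.
        columns = zip(*(params[name] for params in ingredients))
        soup[name] = [sum(column) / n for column in columns]
    return soup


# Toy ingredients standing in for models fine-tuned from the same stock.
ingredient_a = {"w": [1.0, 2.0], "b": [0.0]}
ingredient_b = {"w": [3.0, 4.0], "b": [1.0]}
soup = uniform_soup([ingredient_a, ingredient_b])
```

Averaging is only meaningful here because every ingredient starts from the same stock and therefore shares parameter names and shapes. In the self-supervised setting the abstract describes, each ingredient would be produced by SSL training on unlabeled data rather than by supervised fine-tuning.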
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
