Self-Soupervision: Cooking Model Soups without Labels

Anthony Fuller; James R. Green; Evan Shelhamer

arXiv:2602.02890 · cs.LG · February 4, 2026

TL;DR

This paper introduces Self-Soupervision, which extends model soups to self-supervised learning (SSL): ingredients are fine-tuned on unlabeled data and their parameters are mixed back into one model to improve robustness and accuracy. The ingredients may differ in SSL algorithm and hyperparameters.

Contribution

The paper generalizes model soups to self-supervised learning, so that ingredients trained with different SSL algorithms and hyperparameters can be mixed for better robustness and performance.

Findings

01. Self-Souping boosts robustness on corrupted data.
02. Mixing diverse SSL ingredients improves accuracy.
03. Ingredients can differ in SSL hyperparameters and algorithms.

Abstract

Model soups are strange and strangely effective combinations of parameters. They take a model (the stock), fine-tune it into multiple models (the ingredients), and then mix their parameters back into one model (the soup) to improve predictions. While all known soups require supervised learning, and optimize the same loss on labeled data, our recipes for Self-Soupervision generalize soups to self-supervised learning (SSL). Our Self-Souping lets us flavor ingredients on new data sources, e.g. from unlabeled data from a task for transfer or from a shift for robustness. We show that Self-Souping on corrupted test data, then fine-tuning back on uncorrupted train data, boosts robustness by +3.5% (ImageNet-C) and +7% (LAION-C). Self-Soupervision also unlocks countless SSL algorithms to cook the diverse ingredients needed for more robust soups. We show for the first time that…
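
The stock, ingredients, and soup vocabulary above can be made concrete with a minimal sketch. It assumes the simplest mixing rule, a uniform average of parameters, which is one common soup recipe; the abstract does not say which rule the paper uses. The function name `uniform_soup`, the toy parameter dictionaries, and their values are hypothetical, and the fine-tuning step that would produce real ingredients is replaced by hard-coded stand-ins.

```python
from typing import Dict, List

# A model is represented here as a mapping from parameter name to a flat
# list of floats. This is an illustrative stand-in, not the paper's format.
Params = Dict[str, List[float]]


def uniform_soup(ingredients: List[Params]) -> Params:
    """Average the parameters of several ingredients element-wise.

    Assumption: uniform weights. Other soup recipes weight or select
    ingredients differently.
    """
    if not ingredients:
        raise ValueError("need at least one ingredient")
    names = ingredients[0].keys()
    for params in ingredients:
        if params.keys() != names:
            raise ValueError("ingredients must share parameter names")
    count = len(ingredients)
    soup: Params = {}
    for name in names:
        tensors = [params[name] for params in ingredients]
        if len({len(t) for t in tensors}) != 1:
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # zip(*tensors) groups the i-th entry of every ingredient together.
        soup[name] = [sum(values) / count for values in zip(*tensors)]
    return soup


# The stock is the shared starting point. In practice each ingredient would
# come from fine-tuning the stock; here they are hand-written toy values.
stock = {"w": [1.0, 2.0], "b": [0.0]}
ingredient_a = {"w": [1.5, 2.5], "b": [0.5]}
ingredient_b = {"w": [0.5, 1.5], "b": [-0.5]}

soup = uniform_soup([ingredient_a, ingredient_b])
print(soup)
```

Because every ingredient starts from the same stock, the parameters line up name by name, which is what makes element-wise averaging well defined in this sketch.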

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis