Do the Frankenstein, or how to achieve better out-of-distribution performance with manifold mixing model soup
Hannes Fassold

TL;DR
This paper introduces the manifold mixing model soup, an algorithm that combines latent space manifolds of multiple finetuned models to improve out-of-distribution performance and accuracy on the original dataset.
Contribution
It proposes a novel manifold mixing approach to fuse models, enhancing out-of-distribution robustness beyond standard finetuning methods.
Findings
Out-of-distribution performance improves by 3.5% over the best individual model when finetuning a CLIP model for image classification.
Fused model outperforms individual models on original dataset.
Method enhances robustness without sacrificing in-distribution accuracy.
Abstract
The standard recipe applied in transfer learning is to finetune a pretrained model on the task-specific dataset with different hyperparameter settings and pick the model with the highest accuracy on the validation dataset. Unfortunately, this leads to models which do not perform well under distribution shifts, e.g., when the model is given graphical sketches of the object as input instead of photos. To address this, we propose the manifold mixing model soup, an algorithm which mixes together the latent space manifolds of multiple finetuned models in an optimal way to generate a fused model. We show that the fused model gives significantly better out-of-distribution performance (+3.5% compared to the best individual model) when finetuning a CLIP model for image classification. In addition, it also provides better accuracy on the original dataset where the finetuning has…
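The abstract does not say how the models are split into components or how the mixing coefficients are chosen, so the following is only a minimal sketch of the general idea of component-wise weight mixing, not the paper's algorithm. It assumes each model is a dictionary of named weight arrays, that two finetuned models are fused at a time, and that a coarse grid search over a hypothetical validation scoring function `score_fn` stands in for whatever optimizer the paper uses. The names `mix_components`, `search_lambdas`, and `score_fn` are illustrative.

```python
import numpy as np


def mix_components(model_a, model_b, lambdas):
    """Blend two models component by component.

    model_a, model_b: dict mapping component name -> np.ndarray of weights
    lambdas: dict mapping component name -> mixing factor in [0, 1]
             (1.0 keeps model_a's component, 0.0 keeps model_b's)
    """
    mixed = {}
    for name, w_a in model_a.items():
        lam = lambdas[name]
        mixed[name] = lam * w_a + (1.0 - lam) * model_b[name]
    return mixed


def search_lambdas(model_a, model_b, score_fn,
                   grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Pick one mixing factor per component by coarse grid search.

    score_fn is a hypothetical callable that takes a mixed model and
    returns a validation score (higher is better). The grid search is a
    simple stand-in, assumed here for illustration only.
    """
    # Start from model_a everywhere, then tune one component at a time.
    lambdas = {name: 1.0 for name in model_a}
    for name in model_a:
        best_lam = lambdas[name]
        best_score = score_fn(mix_components(model_a, model_b, lambdas))
        for lam in grid:
            trial = dict(lambdas)
            trial[name] = lam
            score = score_fn(mix_components(model_a, model_b, trial))
            if score > best_score:
                best_lam, best_score = lam, score
        lambdas[name] = best_lam
    return lambdas


# Toy usage with two components and made-up weights.
model_a = {"encoder": np.array([1.0, 2.0]), "head": np.array([0.0, 4.0])}
model_b = {"encoder": np.array([3.0, 6.0]), "head": np.array([2.0, 0.0])}
fused = mix_components(model_a, model_b, {"encoder": 0.5, "head": 0.25})
```

In practice `score_fn` would evaluate the mixed weights on a held-out validation set; giving each component its own factor is what distinguishes this from averaging all weights with a single coefficient.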
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Machine Learning and ELM
Methods: Contrastive Language-Image Pre-training
