Do the Frankenstein, or how to achieve better out-of-distribution performance with manifold mixing model soup

Hannes Fassold

arXiv:2309.08610 · cs.LG · September 19, 2023

Open Access

TL;DR

This paper introduces the manifold mixing model soup, an algorithm that combines latent space manifolds of multiple finetuned models to improve out-of-distribution performance and accuracy on the original dataset.

Contribution

It proposes a novel manifold mixing approach to fuse models, enhancing out-of-distribution robustness beyond standard finetuning methods.

Findings

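01 Out-of-distribution performance improves by 3.5% compared to the best individual model when finetuning a CLIP model for image classification.

02 The fused model outperforms the individual models on the original dataset.

03 The method improves robustness without sacrificing in-distribution accuracy.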

Abstract

The standard recipe in transfer learning is to finetune a pretrained model on the task-specific dataset with different hyperparameter settings and pick the model with the highest accuracy on the validation dataset. Unfortunately, this leads to models that do not perform well under distribution shifts, e.g., when the model is given graphical sketches of the object as input instead of photos. To address this, we propose the manifold mixing model soup, an algorithm that mixes the latent space manifolds of multiple finetuned models in an optimal way to generate a fused model. We show that the fused model gives significantly better out-of-distribution performance (+3.5 % compared to the best individual model) when finetuning a CLIP model for image classification. In addition, it also provides better accuracy on the original dataset where the finetuning has…
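To make the mixing idea more concrete, here is a minimal sketch in Python. It rests on assumptions that the text above does not confirm: that the model can be split into named components, and that each component of the fused model is a weighted average of the corresponding parameters of the finetuned models. The abstract only says the mixing is done "in an optimal way", so the coefficients below are fixed by hand for illustration. All names (`mix_component`, `mix_components`, `block1.w`) are hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: a model is a dict mapping a parameter name
# to a flat list of floats. Real checkpoints would hold tensors instead.
Model = Dict[str, List[float]]


def mix_component(models: List[Model], names: List[str],
                  coeffs: List[float]) -> Model:
    """Weighted average of the parameters listed in `names` across models."""
    total = sum(coeffs)
    weights = [c / total for c in coeffs]  # normalise so the weights sum to 1
    mixed: Model = {}
    for name in names:
        params = [m[name] for m in models]
        mixed[name] = [
            sum(w * p[i] for w, p in zip(weights, params))
            for i in range(len(params[0]))
        ]
    return mixed


def mix_components(models: List[Model], components: List[List[str]],
                   coeffs_per_component: List[List[float]]) -> Model:
    """Fuse models component by component, one coefficient set per component.

    Assumption: the coefficients are given. How they are optimised is not
    described in the text above and is not shown here.
    """
    fused: Model = {}
    for names, coeffs in zip(components, coeffs_per_component):
        fused.update(mix_component(models, names, coeffs))
    return fused


# Two toy "finetuned models" with two components each.
model_a = {"block1.w": [1.0, 2.0], "block2.w": [0.0, 4.0]}
model_b = {"block1.w": [3.0, 6.0], "block2.w": [2.0, 0.0]}

components = [["block1.w"], ["block2.w"]]
coeffs = [[0.5, 0.5], [0.25, 0.75]]  # hand-picked, one set per component

fused = mix_components([model_a, model_b], components, coeffs)
print(fused)  # {'block1.w': [2.0, 4.0], 'block2.w': [1.5, 1.0]}
```

Using a separate coefficient set per component is what distinguishes this sketch from averaging all weights uniformly. In practice the coefficients would be chosen by some search against validation accuracy rather than set by hand.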

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Machine Learning and ELM

Methods: Contrastive Language-Image Pre-training