Model soups to increase inference without increasing compute time

Charles Dansereau; Milo Sobral; Maninder Bhogal; Mehdi Zalai

arXiv:2301.10092 · cs.CV · January 25, 2023 · 1 citation

TL;DR

This paper evaluates different model soup strategies to improve inference performance across various vision models, introducing a new pruned soup recipe that outperforms previous methods in certain cases.

Contribution

The paper introduces a new pruned soup recipe and compares multiple soup strategies across different models, highlighting limitations of weight-averaging in model soups.

Findings

01

Model soups outperformed the best individual model for the pre-trained Vision Transformer, but performed much worse for ResNet and EfficientNet.

02

Pruned soup outperformed uniform and greedy soups in experiments.

03

Limitations of weight-averaging were identified during analysis.

Abstract

In this paper, we compare the performance of Model Soups on three different models (ResNet, ViT and EfficientNet) using three Soup Recipes (Greedy Soup Sorted, Greedy Soup Random and Uniform Soup) from arXiv:2203.05482, and reproduce the results of the authors. We then introduce a new Soup Recipe called Pruned Soup. The soups performed better than the best individual model for the pre-trained vision transformer, but much worse for the ResNet and the EfficientNet. Our pruned soup performed better than the uniform and greedy soups presented in the original paper. We also discuss the limitations of weight-averaging that were found during the experiments. The code for our model soup library and the experiments with different models can be found here: https://github.com/milo-sobral/ModelSoup
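
The abstract names the soup recipes without spelling them out. As a rough illustration, the sketch below shows uniform and greedy (sorted) soups as they are described in arXiv:2203.05482. It is not taken from the authors' library: the `Weights` type (a plain dict of float lists standing in for a model state dict), the function names, and the `evaluate` callback (a held-out score where higher is better) are all assumptions made for this example. Pruned Soup is not defined in the abstract, so it is not sketched here.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for a model state dict: parameter name -> flat values.
Weights = Dict[str, List[float]]


def average_weights(models: Sequence[Weights]) -> Weights:
    """Element-wise mean of the parameters of same-architecture models."""
    n = len(models)
    return {
        name: [sum(m[name][i] for m in models) / n for i in range(len(values))]
        for name, values in models[0].items()
    }


def uniform_soup(models: Sequence[Weights]) -> Weights:
    """Uniform soup: average every candidate, with no selection step."""
    return average_weights(models)


def greedy_soup(
    models: Sequence[Weights], evaluate: Callable[[Weights], float]
) -> Weights:
    """Greedy soup (sorted): visit candidates from best to worst held-out
    score and keep one only if the averaged soup does not get worse."""
    ranked = sorted(models, key=evaluate, reverse=True)
    ingredients = [ranked[0]]
    best = evaluate(ranked[0])
    for candidate in ranked[1:]:
        score = evaluate(average_weights(ingredients + [candidate]))
        if score >= best:
            ingredients.append(candidate)
            best = score
    return average_weights(ingredients)
```

With real networks, `evaluate` would load the averaged weights into the model and measure held-out accuracy. A random-order variant could presumably be obtained by shuffling the candidates instead of sorting them, though the abstract does not give the exact procedure.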

Code & Models

Repositories

milo-sobral/modelsoup
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques

Methods: Lib · Model Soups · Depthwise Convolution · Pointwise Convolution · Dense Connections · Kaiming Initialization · Depthwise Separable Convolution · Inverted Residual Block · Residual Connection