PopulAtion Parameter Averaging (PAPA)

Alexia Jolicoeur-Martineau; Emy Gervais; Kilian Fatras; Yan Zhang; Simon Lacoste-Julien

arXiv:2304.03094 · cs.LG · May 7, 2024 · 1 citation

TL;DR

PAPA is a novel method that combines the diversity of ensemble models with the efficiency of weight averaging, improving accuracy by leveraging a population of diverse neural networks.

Contribution

We introduce PAPA, a new approach that gradually averages weights of diverse models to bridge the gap between ensembling and weight averaging.

Findings

01 Increases accuracy by up to 1.9% on CIFAR-100.

02 Reduces the performance gap between averaging and ensembling.

03 Effective across multiple datasets, including CIFAR-10, CIFAR-100, and ImageNet.

Abstract

Ensemble methods combine the predictions of multiple models to improve performance, but they require significantly higher computation costs at inference time. To avoid these costs, multiple neural networks can be combined into one by averaging their weights. However, this usually performs significantly worse than ensembling. Weight averaging is only beneficial when the networks are different enough to benefit from being combined, but similar enough to average well. Based on this idea, we propose PopulAtion Parameter Averaging (PAPA): a method that combines the generality of ensembling with the efficiency of weight averaging. PAPA leverages a population of diverse models (trained on different data orders, augmentations, and regularizations) while slowly pushing the weights of the networks toward the population average of the weights. We also propose PAPA variants (PAPA-all and PAPA-2) that average…
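The idea of slowly pushing each network's weights toward the population average can be illustrated with a small sketch. This is not the authors' implementation: the function names, the interpolation coefficient `alpha`, its value of 0.99, the population size, and the use of plain Python lists in place of network weights are all assumptions made for illustration. The training steps that would normally run between pulls are omitted.

```python
import random


def population_mean(population):
    """Element-wise mean of a list of equal-length weight vectors."""
    n = len(population)
    return [sum(ws) / n for ws in zip(*population)]


def pull_toward_mean(population, alpha=0.99):
    """Interpolate every member toward the population mean.

    alpha is a hypothetical knob: 1.0 means no pull at all,
    0.0 means every member is replaced by the mean.
    """
    mean = population_mean(population)
    return [
        [alpha * w + (1.0 - alpha) * m for w, m in zip(member, mean)]
        for member in population
    ]


def spread(population):
    """Largest absolute deviation of any weight from the population mean."""
    mean = population_mean(population)
    return max(abs(w - m) for member in population for w, m in zip(member, mean))


# Toy population: 3 "networks", each with 4 weights (illustrative sizes).
rng = random.Random(0)
population = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]

mean_before = population_mean(population)
spread_before = spread(population)

for step in range(100):
    # A real run would take independent training steps for each member here
    # (different data order, augmentation, regularization); omitted in this toy.
    population = pull_toward_mean(population, alpha=0.99)

mean_after = population_mean(population)
spread_after = spread(population)

print(f"spread before: {spread_before:.4f}, after: {spread_after:.4f}")
```

In this toy setting the pull leaves the population mean unchanged and shrinks every member's distance to it by a factor of `alpha` per step, so 100 pulls at `alpha = 0.99` leave roughly 37% of the initial spread. With training steps interleaved, the members would keep diverging between pulls, which is the balance between diversity and averageability that the abstract describes.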

Code & Models

Repositories

samsungsailmontreal/papa
PyTorch · Official

Taxonomy

Topics: COVID-19 diagnosis using AI · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning