Merging of neural networks

Martin Pašen; Vladimír Boža

arXiv:2204.09973·cs.LG·August 23, 2022

TL;DR

This paper introduces a method for merging two neural networks, trained from different initializations, into a single network of the same size. Training two networks and merging them performs better than training one network for longer, and the procedure can serve as a final step after trying multiple training seeds.

Contribution

The authors present a simple channel-selection scheme for merging two neural networks into one of the same size, and show that training two networks and merging them outperforms training a single network for an extended period.

Findings

01

Merging networks can outperform longer single training.

02

Merging works by selecting channels from each input network, so the merged network is the same size as the originals.

03

The method is publicly available for use.

Abstract

We propose a simple scheme for merging two neural networks trained with different starting initializations into a single one with the same size as the original ones. We do this by carefully selecting channels from each input network. Our procedure might be used as a finalization step after one tries multiple starting seeds to avoid an unlucky one. We also show that training two networks and merging them leads to better performance than training a single network for an extended period of time. Availability: https://github.com/fmfi-compbio/neural-network-merging
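
As an illustration of what selecting channels from two networks can look like, here is a minimal NumPy sketch for a one-hidden-layer MLP. It is not the authors' implementation (their PyTorch code is in the repository linked above). The function name `merge_hidden_channels`, the dict layout, and the scoring rule (L2 norm of each channel's outgoing weights) are all assumptions made for this example; the abstract does not say how channels are chosen.

```python
import numpy as np


def merge_hidden_channels(net_a, net_b, scores_a, scores_b):
    """Merge two one-hidden-layer MLPs by keeping the top-scoring hidden channels.

    Each net is a dict with "W1" (hidden x in), "b1" (hidden,) and
    "W2" (out x hidden). The scores are supplied by the caller; how to
    score a channel is an assumption of this sketch, not the paper's rule.
    """
    hidden = net_a["W1"].shape[0]

    # Stack the hidden channels of both networks: 2 * hidden candidates.
    W1 = np.concatenate([net_a["W1"], net_b["W1"]], axis=0)
    b1 = np.concatenate([net_a["b1"], net_b["b1"]], axis=0)
    W2 = np.concatenate([net_a["W2"], net_b["W2"]], axis=1)
    scores = np.concatenate([scores_a, scores_b])

    # Keep the `hidden` best candidates so the merged net has the original size.
    keep = np.sort(np.argsort(-scores, kind="stable")[:hidden])

    merged = {"W1": W1[keep], "b1": b1[keep], "W2": W2[:, keep]}
    channels_from_a = int(np.sum(keep < hidden))
    return merged, channels_from_a


rng = np.random.default_rng(0)


def random_net(d_in=4, hidden=6, d_out=3):
    """Stand-in for a trained network; real weights would come from training."""
    return {
        "W1": rng.normal(size=(hidden, d_in)),
        "b1": rng.normal(size=hidden),
        "W2": rng.normal(size=(d_out, hidden)),
    }


def outgoing_norm(net):
    """Hypothetical channel score: L2 norm of each channel's outgoing weights."""
    return np.linalg.norm(net["W2"], axis=0)


net_a, net_b = random_net(), random_net()
merged, channels_from_a = merge_hidden_channels(
    net_a, net_b, outgoing_norm(net_a), outgoing_norm(net_b)
)
```

The merged network has the same hidden width as each input, which mirrors the "same size as the original ones" property stated in the abstract. A network assembled this way would most likely need further training, which the sketch does not cover.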

Code & Models

Repositories

fmfi-compbio/neural-network-merging
PyTorch · Official

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Algorithms