Merging of neural networks
Martin Pašen, Vladimír Boža

TL;DR
This paper introduces a method for merging two neural networks, trained from different initializations, into a single network of the same size as each original. Merging can improve performance and may serve as a finalization step after training with several random seeds.
Contribution
The authors present a simple channel selection scheme for merging two neural networks and show that training two networks and merging them performs better than training a single network for an extended period of time.
Findings
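Training two networks and merging them can outperform training a single network for an extended period of time.
The merging scheme is simple: it selects channels from each input network, and the merged network has the same size as each original.
The code is publicly available at https://github.com/fmfi-compbio/neural-network-merging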
Abstract
We propose a simple scheme for merging two neural networks trained with different initializations into a single one with the same size as the original ones. We do this by carefully selecting channels from each input network. Our procedure might be used as a finalization step after one tries multiple starting seeds to avoid an unlucky one. We also show that training two networks and merging them leads to better performance than training a single network for an extended period of time. Availability: https://github.com/fmfi-compbio/neural-network-merging
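To make the idea of "selecting channels from each input network" concrete, here is a minimal sketch for one hidden layer of two small fully connected networks. It is an illustration under stated assumptions, not the authors' procedure. The abstract does not say how channels are chosen, so the importance score used here (L1 norm of incoming weights times L1 norm of outgoing weights) is a hypothetical stand-in. The names channel_scores, merge_hidden_layer, and random_net are also invented for this example. Biases and any training or fine-tuning of the merged network are left out.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]
Net = Tuple[Matrix, Matrix]  # (w_in: hidden x inputs, w_out: outputs x hidden)


def channel_scores(w_in: Matrix, w_out: Matrix) -> List[float]:
    """Hypothetical importance score for each hidden channel (an assumption,
    not taken from the paper): L1 norm of incoming weights times L1 norm of
    outgoing weights."""
    scores = []
    for j, row in enumerate(w_in):
        incoming = sum(abs(v) for v in row)
        outgoing = sum(abs(out_row[j]) for out_row in w_out)
        scores.append(incoming * outgoing)
    return scores


def merge_hidden_layer(net_a: Net, net_b: Net):
    """Pool the hidden channels of both networks and keep the top-scoring
    ones, so the merged layer has as many channels as each input network."""
    hidden = len(net_a[0])
    n_out = len(net_a[1])
    sources = {"A": net_a, "B": net_b}

    # Candidate pool: (score, source network, channel index).
    pool = []
    for name, (w_in, w_out) in sources.items():
        for j, score in enumerate(channel_scores(w_in, w_out)):
            pool.append((score, name, j))

    # Keep the `hidden` highest-scoring channels out of the 2 * hidden candidates.
    kept = sorted(pool, key=lambda item: item[0], reverse=True)[:hidden]

    # Copy each kept channel's incoming row and outgoing column.
    merged_in = [list(sources[name][0][j]) for _, name, j in kept]
    merged_out = [
        [sources[name][1][o][j] for _, name, j in kept] for o in range(n_out)
    ]
    origin = [(name, j) for _, name, j in kept]
    return merged_in, merged_out, origin


def random_net(hidden: int, n_in: int, n_out: int, seed: int) -> Net:
    """Stand-in for a trained network: random weights from a given seed."""
    rng = random.Random(seed)
    w_in = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(hidden)]
    w_out = [[rng.gauss(0, 1) for _ in range(hidden)] for _ in range(n_out)]
    return w_in, w_out


if __name__ == "__main__":
    net_a = random_net(hidden=4, n_in=3, n_out=2, seed=0)
    net_b = random_net(hidden=4, n_in=3, n_out=2, seed=1)
    merged_in, merged_out, origin = merge_hidden_layer(net_a, net_b)
    print("kept channels (source, index):", origin)
```

The merged layer has the same shape as each input layer, which matches the "same size as the original ones" constraint in the abstract. For the selection criterion and training procedure actually used, see the repository linked above.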
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Algorithms
