On Cross-Layer Alignment for Model Fusion of Heterogeneous Neural Networks
Dang Nguyen, Trang Nguyen, Khai Nguyen, Dinh Phung, Hung Bui, Nhat Ho

TL;DR
This paper introduces CLAFusion, a framework that fuses neural networks with different numbers of layers by aligning their layers across networks and balancing the layer counts before layer-wise fusion.
Contribution
We propose CLAFusion, a method for fusing heterogeneous neural networks (networks with different numbers of layers) via cross-layer alignment, which is solved with dynamic programming.
Findings
With an extra finetuning step, CLAFusion improves residual network accuracy on CIFAR and Tiny-ImageNet datasets.
The method effectively balances layers before fusion, enabling better model compression.
CLAFusion benefits knowledge distillation in teacher-student setups.
Abstract
Layer-wise model fusion via optimal transport, named OTFusion, applies soft neuron association for unifying different pre-trained networks to save computational resources. While enjoying its success, OTFusion requires the input networks to have the same number of layers. To address this issue, we propose a novel model fusion framework, named CLAFusion, to fuse neural networks with a different number of layers, which we refer to as heterogeneous neural networks, via cross-layer alignment. The cross-layer alignment problem, which is an unbalanced assignment problem, can be solved efficiently using dynamic programming. Based on the cross-layer alignment, our framework balances the number of layers of neural networks before applying layer-wise model fusion. Our experiments indicate that CLAFusion, with an extra finetuning process, improves the accuracy of residual networks on the CIFAR10,…
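To make the dynamic programming step concrete, here is a minimal sketch of one way an unbalanced, order-preserving layer assignment could be solved. It is an illustration under stated assumptions, not the authors' implementation: the function name align_layers, the precomputed cost matrix, and the requirement that the matching be strictly increasing are all assumptions introduced here. The abstract does not specify how layer dissimilarity is measured.

```python
from typing import List, Sequence


def align_layers(cost: Sequence[Sequence[float]]) -> List[int]:
    """Match each layer of a shallower network to a distinct layer of a deeper one.

    cost[i][j] is an assumed, precomputed dissimilarity between layer i of the
    shallower network and layer j of the deeper network. The returned list a is
    strictly increasing, and a[i] is the deeper-network layer matched to layer i.
    """
    m, n = len(cost), len(cost[0])
    if m > n:
        raise ValueError("the first network must not have more layers than the second")

    inf = float("inf")
    # best[i][j]: minimum total cost of matching the first i shallow layers
    # using only the first j deep layers.
    best = [[inf] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        best[0][j] = 0.0

    for i in range(1, m + 1):
        for j in range(i, n + 1):
            skip = best[i][j - 1]  # deep layer j-1 is left unmatched
            match = best[i - 1][j - 1] + cost[i - 1][j - 1]  # pair i-1 with j-1
            best[i][j] = min(skip, match)

    # Backtrack from the full problem to recover the matched indices.
    alignment: List[int] = []
    i, j = m, n
    while i > 0:
        if best[i][j] == best[i - 1][j - 1] + cost[i - 1][j - 1]:
            alignment.append(j - 1)
            i -= 1
            j -= 1
        else:
            j -= 1
    alignment.reverse()
    return alignment
```

Under these assumptions, the deeper-network layers left unmatched are the ones a balancing step would need to account for before layer-wise fusion. The table has (m+1)(n+1) entries, so the sketch runs in O(mn) time for networks with m and n layers.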
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
Methods: Knowledge Distillation
