Unifying and Merging Well-trained Deep Neural Networks for Inference Stage

Yi-Min Chou; Yi-Ming Chan; Jia-Hong Lee; Chih-Yi Chiu; Chu-Song Chen

arXiv:1805.04980·cs.CV·May 15, 2018·6 cites

TL;DR

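This paper introduces a method to merge two well-trained convolutional neural networks, which may have different architectures and handle different tasks, into a single compact model for inference. Reusing the weights of the original networks reduces the training overhead of building the merged model.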

Contribution

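It aligns the layers of the original networks and merges them by sharing representative codes of their weights. The shared weights are then fine-tuned, so the merged model can run the original tasks simultaneously with reduced training overhead.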

Findings

01. Merged models perform well on multiple tasks.

02. Significant reduction in training time and resource consumption.

03. Effective for different architectures handling various tasks.

Abstract

We propose a novel method to merge convolutional neural networks for the inference stage. Given two well-trained networks that may have different architectures and handle different tasks, our method aligns the layers of the original networks and merges them into a unified model by sharing the representative codes of weights. The shared weights are further re-trained to fine-tune the performance of the merged model. The proposed method effectively produces a compact model that may run the original tasks simultaneously on resource-limited devices. As it preserves the general architectures and leverages the co-used weights of well-trained networks, a substantial training overhead can be reduced to shorten the system development time. Experimental results demonstrate a satisfactory performance and validate the effectiveness of the method.
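The idea of "sharing the representative codes of weights" can be illustrated with a small sketch. This is not the authors' implementation: it assumes, for illustration only, that the weights of two already aligned layers are cut into short sub-vectors and that plain k-means over the pooled sub-vectors yields one shared codebook. The names `kmeans`, `merge_layers`, `decode`, and the parameters `k` and `sub_dim` are hypothetical, and the fine-tuning step described in the abstract is omitted.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (codebook, assignments)."""
    rng = np.random.default_rng(seed)
    # Initialise codewords from k distinct input points.
    codebook = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every codeword.
        d = ((points[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = points[assign == j]
            if len(members):  # keep the old codeword if a cluster is empty
                codebook[j] = members.mean(axis=0)
    # Final assignment against the final codebook.
    d = ((points[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return codebook, d.argmin(axis=1)

def merge_layers(w_a, w_b, k=16, sub_dim=4):
    """Encode two aligned layers with ONE shared codebook (assumed scheme)."""
    va = w_a.reshape(-1, sub_dim)  # sub-vectors of network A's layer
    vb = w_b.reshape(-1, sub_dim)  # sub-vectors of network B's layer
    codebook, assign = kmeans(np.vstack([va, vb]), k)
    return codebook, assign[:len(va)], assign[len(va):]

def decode(codebook, idx, shape):
    """Rebuild an approximate weight tensor from codeword indices."""
    return codebook[idx].reshape(shape)

# Toy weights standing in for two well-trained layers of different sizes.
rng = np.random.default_rng(0)
w_a = rng.normal(size=(8, 16))
w_b = rng.normal(size=(12, 16))

codebook, idx_a, idx_b = merge_layers(w_a, w_b)
w_a_hat = decode(codebook, idx_a, w_a.shape)
w_b_hat = decode(codebook, idx_b, w_b.shape)
```

In this sketch both layers store only integer indices into the same `codebook`, so the codewords are the only floating-point weights kept for the pair. Re-training those shared codewords on both tasks would correspond to the fine-tuning step, which is left out here.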

Code & Models

Repositories

ivclab/NeuralMerger (TensorFlow, official)

Taxonomy

Topics: Advanced Neural Network Applications · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning