Explore the Knowledge contained in Network Weights to Obtain Sparse Neural Networks

Mengqiao Han; Xiabi Liu; Zhaoyang Hai; Zhengwen Li

arXiv:2103.15590 · cs.LG · April 28, 2021

TL;DR

This paper introduces a switcher neural network (SNN) that reads the weights of a task neural network (TNN) and switches its connections on or off, so that sparse fully connected layers are learned automatically alongside the weights.

Contribution

A switcher neural network that uses the task network's own weights to decide which connections to keep, learned alternately with the task network under a common objective.

Findings

01. Achieves sparse, well-performing fully connected layers.

02. Effective across various network architectures and datasets.

03. Stable convergence to optimal sparse structures.

Abstract

Sparse neural networks are important for achieving better generalization and improving computational efficiency. This paper proposes a novel learning approach that obtains sparse fully connected layers in neural networks (NNs) automatically. We design a switcher neural network (SNN) to optimize the structure of the task neural network (TNN). The SNN takes the weights of the TNN as its inputs, and its outputs are used to switch the connections of the TNN. In this way, the knowledge contained in the weights of the TNN is explored to determine the importance of each connection and, consequently, the structure of the TNN. The SNN and TNN are learned alternately with stochastic gradient descent (SGD) optimization, targeting a common objective. After learning, we achieve the optimal structure and the optimal parameters of the TNN simultaneously. In order to evaluate the proposed approach, we conduct image…
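
The alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the SNN architecture or the exact objective, so everything below is an assumption made for illustration. The "SNN" is reduced to a single sigmoid unit with two parameters (`a`, `b`) that maps each squared TNN weight to a soft switch, the "TNN" is one linear layer, the common objective is assumed to be mean squared error plus a penalty on open switches (`lam`), and the TNN step treats the switches as constants. All function names and hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def switches(w, a, b):
    # Hypothetical one-unit "SNN": maps each TNN weight to a soft switch in [0, 1].
    # Feeding the squared weight is an assumption of this sketch.
    return [sigmoid(a * wj * wj + b) for wj in w]


def predict(w, s, x):
    # Toy "TNN": one linear layer whose connections are scaled by the switches.
    return sum(wj * sj * xj for wj, sj, xj in zip(w, s, x))


def objective(w, a, b, data, lam):
    # Assumed common objective: mean squared error plus a penalty on open switches.
    s = switches(w, a, b)
    mse = sum((predict(w, s, x) - y) ** 2 for x, y in data) / len(data)
    return mse + lam * sum(s) / len(s)


def train(data, n_inputs, steps=300, lr_tnn=0.05, lr_snn=0.01, lam=0.05, seed=0):
    rng = random.Random(seed)
    w = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
    a, b = 1.0, 0.0
    m = len(data)
    for _ in range(steps):
        # Step 1: gradient step on the TNN weights, switches held fixed.
        s = switches(w, a, b)
        err = [predict(w, s, x) - y for x, y in data]
        grad_w = [
            2.0 * s[j] * sum(e * x[j] for e, (x, _) in zip(err, data)) / m
            for j in range(n_inputs)
        ]
        w = [wj - lr_tnn * g for wj, g in zip(w, grad_w)]

        # Step 2: gradient step on the SNN parameters, TNN weights held fixed.
        s = switches(w, a, b)
        err = [predict(w, s, x) - y for x, y in data]
        grad_a = grad_b = 0.0
        for j in range(n_inputs):
            # Derivative of the objective with respect to switch j.
            dl_ds = 2.0 * w[j] * sum(e * x[j] for e, (x, _) in zip(err, data)) / m
            dl_ds += lam / n_inputs
            ds_dz = s[j] * (1.0 - s[j])  # derivative of the sigmoid
            grad_a += dl_ds * ds_dz * w[j] * w[j]
            grad_b += dl_ds * ds_dz
        a -= lr_snn * grad_a
        b -= lr_snn * grad_b
    return w, a, b


if __name__ == "__main__":
    # Synthetic regression task in which only the first two inputs matter.
    rng = random.Random(1)
    true_w = [1.0, -1.0, 0.0, 0.0]
    data = []
    for _ in range(64):
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        data.append((x, sum(t * xi for t, xi in zip(true_w, x))))
    w, a, b = train(data, n_inputs=len(true_w))
    print("switch values:", switches(w, a, b))
    print("objective:", objective(w, a, b, data, lam=0.05))
```

In a sketch like this, a hard structure could be read off by thresholding the switch values after training. How the paper turns SNN outputs into on/off decisions is not stated in the abstract, so that step is left out here.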

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning