Training Sparse Neural Network by Constraining Synaptic Weight on Unit Lp Sphere

Weipeng Li; Xiaogang Yang; Chuanxiang Li; Ruitao Lu; Xueli Xie

arXiv:2103.16013·cs.LG·March 31, 2021

TL;DR

This paper trains sparse neural networks by constraining synaptic weights to the unit Lp-sphere. The weights are optimized with Lp-spherical gradient descent (LpSGD), which comes with a convergence proof, and the network topology evolves through semi-pruning and threshold adaptation.

Contribution

The paper proposes Lp-spherical gradient descent for constrained optimization, analyzes the effect of p on sparsity, and introduces semi-pruning for effective topology evolution.
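To make the idea of optimizing under a unit Lp-norm constraint concrete, here is a minimal sketch in NumPy. It is not the paper's LpSGD update, whose derivation is not reproduced on this page. It swaps in a simpler, generic scheme: take a plain gradient step, then rescale the weights radially so that their Lp norm is 1 again. The function names `lp_norm` and `renormalized_step`, the toy quadratic loss, and the values of `p` and `lr` are all assumptions chosen for illustration.

```python
import numpy as np

def lp_norm(w, p):
    """Lp norm of a weight vector: (sum_i |w_i|^p)^(1/p)."""
    return (np.abs(w) ** p).sum() ** (1.0 / p)

def renormalized_step(w, grad, lr, p):
    """Plain gradient step followed by radial rescaling back to ||w||_p = 1.

    This is a generic stand-in, not the LpSGD update from the paper.
    """
    w_new = w - lr * grad
    return w_new / lp_norm(w_new, p)

# Toy problem (hypothetical): pull w toward a fixed target while staying on the sphere.
rng = np.random.default_rng(0)
target = rng.normal(size=8)
p = 1.5
w = rng.normal(size=8)
w = w / lp_norm(w, p)  # start on the unit Lp-sphere

for _ in range(100):
    grad = 2.0 * (w - target)  # gradient of ||w - target||_2^2
    w = renormalized_step(w, grad, lr=0.05, p=p)
```

After every step the weights satisfy the constraint exactly (up to floating-point error), which is the property any optimizer on the unit Lp-sphere has to maintain.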

Findings

01. LpSGD is proved to converge, and the convergence is confirmed empirically.

02. The expected Hoyer's sparsity can be predicted under a gamma distribution hypothesis.

03. Experiments on benchmark datasets validate the method's effectiveness.

Abstract

Sparse deep neural networks have shown their advantages over dense models, with fewer parameters and higher computational efficiency. Here we demonstrate that constraining the synaptic weights on the unit Lp-sphere enables flexible control of the sparsity through p and improves the generalization ability of neural networks. First, to optimize the synaptic weights constrained on the unit Lp-sphere, a parameter optimization algorithm, Lp-spherical gradient descent (LpSGD), is derived from the augmented Empirical Risk Minimization condition and is theoretically proved to be convergent. To understand how p affects Hoyer's sparsity, the expectation of Hoyer's sparsity under the hypothesis of a gamma distribution is given, and the predictions are verified at various p under different conditions. In addition, "semi-pruning" and threshold adaptation are designed for topology…
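The abstract measures sparsity with Hoyer's sparsity. As a reference point, the sketch below computes the standard Hoyer measure for a vector of n weights, (sqrt(n) - ||w||_1 / ||w||_2) / (sqrt(n) - 1), which is 0 when all entries have equal magnitude and 1 when exactly one entry is nonzero. The helper names `hoyer_sparsity` and `lp_normalize` are assumptions for illustration, and whether the paper applies the measure per layer or per neuron is not stated on this page.

```python
import numpy as np

def hoyer_sparsity(w):
    """Hoyer's sparsity of a nonzero vector with at least two entries."""
    w = np.asarray(w, dtype=float).ravel()
    n = w.size
    l1 = np.abs(w).sum()
    l2 = np.sqrt((w ** 2).sum())
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1.0)

def lp_normalize(w, p):
    """Rescale w so that its Lp norm equals 1 (a point on the unit Lp-sphere)."""
    w = np.asarray(w, dtype=float)
    return w / (np.abs(w) ** p).sum() ** (1.0 / p)

dense = np.array([1.0, 1.0, 1.0, 1.0])    # equal magnitudes: least sparse
one_hot = np.array([0.0, 0.0, 1.0, 0.0])  # single nonzero entry: most sparse
mixed = np.array([0.1, -2.0, 0.05, 0.3])

s_dense = hoyer_sparsity(dense)
s_one_hot = hoyer_sparsity(one_hot)
s_mixed = hoyer_sparsity(mixed)
s_mixed_on_sphere = hoyer_sparsity(lp_normalize(mixed, p=1.2))
```

Because the measure depends only on the ratio of the L1 and L2 norms, it is unchanged by rescaling: `s_mixed` and `s_mixed_on_sphere` agree. Placing a fixed vector on the unit Lp-sphere therefore does not by itself change its Hoyer sparsity.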

Code & Models

Repositories

WilliamLiPro/LpSS (PyTorch, official)

Taxonomy

Topics: Machine Learning and ELM · Advanced Memory and Neural Computing · Domain Adaptation and Few-Shot Learning