Global Sparse Momentum SGD for Pruning Very Deep Neural Networks

Xiaohan Ding; Guiguang Ding; Xiangxin Zhou; Yuchen Guo; Jungong Han; Ji Liu

arXiv:1909.12778 · cs.LG · October 28, 2019 · 125 cites

TL;DR

This paper introduces a global sparse momentum SGD method for pruning deep neural networks, enabling automatic, end-to-end model compression without extensive manual tuning or post-pruning retraining.

Contribution

The proposed method determines layer-wise sparsity automatically, simplifies the pruning process, and improves the discovery of effective subnetworks compared to prior techniques.

Findings

01. Layer-wise sparsity ratios are determined automatically from a single global compression ratio.

02. No post-pruning retraining is needed.

03. Winning subnetworks are identified more effectively than with prior techniques.

Abstract

Deep neural networks (DNNs) are powerful but computationally expensive and memory intensive, which impedes their practical use on resource-constrained front-end devices. DNN pruning is an approach to deep model compression that aims to eliminate some parameters with tolerable performance degradation. In this paper, we propose a novel momentum-SGD-based optimization method that reduces network complexity by on-the-fly pruning. Concretely, given a global compression ratio, at each training iteration we categorize all the parameters into two parts, which are updated using different rules. In this way, we gradually zero out the redundant parameters, as we update them using only the ordinary weight decay but no gradients derived from the objective function. As a departure from prior methods that require heavy human work to tune the layer-wise sparsity ratios, prune by solving complicated…
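
The two-rule update described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The excerpt above does not say how parameters are split into the two parts, so the sketch assumes a saliency score |w * g| ranked globally across all layers. The function name `gsm_step`, the `keep_count` argument (the number of parameters that still receive gradients, which one might derive from the global compression ratio), and the hyperparameter defaults are all hypothetical.

```python
def gsm_step(params, grads, velocity, keep_count,
             lr=0.1, momentum=0.9, weight_decay=1e-4):
    """One illustrative sparse-momentum update over dicts of float lists.

    params, grads, velocity map a layer name to a flat list of floats.
    """
    # ASSUMPTION: saliency score |w * g|; the excerpt does not state the criterion.
    scores = {
        name: [abs(w * g) for w, g in zip(params[name], grads[name])]
        for name in params
    }
    # Rank scores globally (across all layers), not layer by layer.
    ranked = sorted((s for layer in scores.values() for s in layer), reverse=True)
    keep_count = min(keep_count, len(ranked))
    threshold = ranked[keep_count - 1] if keep_count > 0 else float("inf")

    for name in params:
        for i, (w, g) in enumerate(zip(params[name], grads[name])):
            # Ties at the threshold may activate slightly more than keep_count.
            active = scores[name][i] >= threshold
            # Active parameters: gradient plus weight decay.
            # Redundant parameters: weight decay only, so they shrink toward zero.
            update = (g if active else 0.0) + weight_decay * w
            velocity[name][i] = momentum * velocity[name][i] + update
            params[name][i] = w - lr * velocity[name][i]
    return params, velocity


# Toy usage: three parameters in two hypothetical layers, keep one active.
params = {"a": [1.0, 2.0], "b": [3.0]}
grads = {"a": [0.1, 0.0], "b": [1.0]}
velocity = {"a": [0.0, 0.0], "b": [0.0]}
params, velocity = gsm_step(params, grads, velocity, keep_count=1,
                            lr=0.1, momentum=0.9, weight_decay=0.1)
```

In this toy call only the parameter in layer `b` has a score above the global threshold, so the two parameters in layer `a` are moved by weight decay alone. Repeating the step with parameters that are never selected drives them steadily toward zero, which is the gradual zeroing-out the abstract refers to.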

Taxonomy

Topics: Speech and Audio Processing · Neural Networks and Applications

Methods: Pruning · Weight Decay