StructADMM: A Systematic, High-Efficiency Framework of Structured Weight Pruning for DNNs
Tianyun Zhang, Shaokai Ye, Kaiqi Zhang, Xiaolong Ma, Ning Liu, Linfeng Zhang, Jian Tang, Kaisheng Ma, Xue Lin, Makan Fardad, Yanzhi Wang

TL;DR
This paper introduces StructADMM, a systematic framework for structured weight pruning in DNNs, achieving high sparsity and GPU acceleration without accuracy loss, surpassing prior methods.
Contribution
Proposes a unified ADMM-based framework for various structured sparsity types, significantly improving pruning rate and GPU speedup while maintaining accuracy.
Findings
Achieves 2.58X and 3.65X speedup on GPUs without accuracy loss.
Reaches up to 8.52X speedup with moderate accuracy loss.
Reports improved results over prior methods on ResNet models and on the UCF101 and CIFAR-10 datasets.
Abstract
Weight pruning methods of DNNs have been demonstrated to achieve a good model pruning rate without loss of accuracy, thereby alleviating the significant computation/storage requirements of large-scale DNNs. Structured weight pruning methods have been proposed to overcome the limitation of irregular network structure and demonstrated actual GPU acceleration. However, in prior work the pruning rate (degree of sparsity) and GPU acceleration are limited (to less than 50%) when accuracy needs to be maintained. In this work, we overcome these limitations by proposing a unified, systematic framework of structured weight pruning for DNNs. It is a framework that can be used to induce different types of structured sparsity, such as filter-wise, channel-wise, and shape-wise sparsity, as well as non-structured sparsity. The proposed framework incorporates stochastic gradient descent with ADMM, and can…
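As a rough illustration of how ADMM can induce the filter-wise or channel-wise sparsity mentioned in the abstract, the sketch below shows the projection and dual-update steps in scaled-form ADMM. It is not taken from the paper: the function names `project_structured` and `admm_z_u_update`, the 4-D weight layout (filters, channels, height, width), and the "keep the largest-norm slices" projection rule are all assumptions made for this example. The SGD step that would update the weights `W` between these calls is omitted.

```python
import numpy as np

def project_structured(weights, keep, axis):
    """Keep the `keep` slices with the largest L2 norm along `axis`; zero the rest.

    Assumed layout: (filters, channels, height, width), so axis=0 gives
    filter-wise sparsity and axis=1 gives channel-wise sparsity.
    """
    n = weights.shape[axis]
    if not 1 <= keep <= n:
        raise ValueError("keep must be between 1 and the size of the axis")
    other_axes = tuple(i for i in range(weights.ndim) if i != axis)
    norms = np.sqrt((weights ** 2).sum(axis=other_axes))
    kept = np.argsort(norms)[-keep:]          # indices of the largest-norm slices
    mask = np.zeros(n, dtype=bool)
    mask[kept] = True
    shape = [1] * weights.ndim
    shape[axis] = n                           # broadcast the mask along `axis`
    return weights * mask.reshape(shape)

def admm_z_u_update(W, U, keep, axis):
    """One auxiliary-variable and dual update (scaled-form ADMM, assumed here)."""
    Z = project_structured(W + U, keep, axis)  # project onto the sparsity set
    U = U + W - Z                              # accumulate the constraint residual
    return Z, U

# Toy usage: a layer with 8 filters, pruned to 2 filters.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4, 3, 3))
U = np.zeros_like(W)
Z, U = admm_z_u_update(W, U, keep=2, axis=0)
nonzero_filters = int((np.abs(Z).reshape(8, -1).sum(axis=1) > 0).sum())
```

In a full training loop, one would alternate these updates with gradient steps on the loss plus a quadratic penalty pulling `W` toward `Z - U`; the penalty weight and schedule are left out because the truncated abstract does not specify them.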
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition
Methods: Pruning · Average Pooling · Local Response Normalization · Grouped Convolution · Dropout · Alternating Direction Method of Multipliers · Dense Connections · Softmax
