Cross-Channel Intragroup Sparsity Neural Network
Zhilin Yu, Chao Wang, Xin Wang, Qing Wu, Yong Zhao, Xundong Wu

TL;DR
This paper introduces a novel cross-channel intragroup sparsity structure and training algorithm that improve inference efficiency of pruned neural networks without sacrificing accuracy, addressing limitations of existing pruning methods.
Contribution
The work proposes a new CCI-Sparsity structure and training method that outperform prior pruning techniques in inference efficiency while maintaining high model accuracy.
Findings
CCI-Sparsity outperforms prior pruning methods in inference speed.
The proposed method maintains high accuracy after pruning.
Experimental results show significant efficiency gains with minimal accuracy loss.
Abstract
Modern deep neural networks rely on overparameterization to achieve state-of-the-art generalization. But overparameterized models are computationally expensive. Network pruning is often employed to obtain less demanding models for deployment. Fine-grained pruning removes individual weights in parameter tensors and can achieve a high model compression ratio with little accuracy degradation. However, it introduces irregularity into the computing dataflow and often does not yield improved model inference efficiency in practice. Coarse-grained model pruning, while realizing satisfactory inference speedup through removal of network weights in groups, e.g., an entire filter, often leads to significant accuracy degradation. This work introduces the cross-channel intragroup (CCI) sparsity structure, which can prevent the inference inefficiency of fine-grained pruning while maintaining outstanding…
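The contrast the abstract draws between fine-grained and coarse-grained pruning can be illustrated with a minimal NumPy sketch. It does not implement the CCI-Sparsity structure, which the truncated abstract does not describe. The function names, the magnitude and L1-norm selection criteria, and the tensor shape are all assumptions chosen for illustration, not details taken from the paper.

```python
import numpy as np


def fine_grained_prune(weights, sparsity):
    """Zero the smallest-magnitude individual weights (assumed criterion)."""
    k = int(round(sparsity * weights.size))           # number of weights to zero
    idx = np.argsort(np.abs(weights), axis=None)[:k]  # flat indices of smallest |w|
    pruned = weights.flatten()                        # flatten() returns a copy
    pruned[idx] = 0.0
    return pruned.reshape(weights.shape)


def filter_prune(weights, sparsity):
    """Zero whole output filters with the smallest L1 norm (assumed criterion)."""
    n_filters = weights.shape[0]
    norms = np.abs(weights).reshape(n_filters, -1).sum(axis=1)
    k = int(round(sparsity * n_filters))              # number of filters to zero
    idx = np.argsort(norms)[:k]
    pruned = weights.copy()
    pruned[idx] = 0.0
    return pruned


# Hypothetical convolution weights: (out_channels, in_channels, height, width).
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 4, 3, 3))

fine = fine_grained_prune(w, 0.5)
coarse = filter_prune(w, 0.5)

# Both variants zero the same number of weights, but only filter pruning
# leaves a regular pattern in which entire filters can be skipped.
print("zeros (fine-grained):", int((fine == 0).sum()))
print("zeros (filter):      ", int((coarse == 0).sum()))
```

At equal sparsity the two masks differ only in where the zeros fall: scattered zeros keep the dense tensor shape and give irregular memory access, while zeroed filters can simply be dropped from the layer.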
Taxonomy
Topics: Machine Learning and ELM · Advanced Neural Network Applications · Neural Networks and Applications
Methods: Pruning
