TL;DR
This paper introduces an accelerator-aware pruning method for CNNs that improves hardware efficiency and performance by aligning pruning with the constraints of the accelerator architecture. The method applies to a variety of network types.
Contribution
The paper proposes a novel pruning scheme that accounts for the accelerator architecture, maintaining high pruning ratios while improving efficiency and reducing logic complexity.
Findings
Doubling of accelerator performance with the proposed pruning scheme
High pruning ratios achieved on diverse networks, including AlexNet, VGG16, ResNet, and MobileNet
Reduced logic complexity of sparse accelerators
Abstract
Convolutional neural networks have shown tremendous performance capabilities in computer vision tasks, but their excessive amounts of weight storage and arithmetic operations prevent them from being adopted in embedded environments. One of the solutions involves pruning, where certain unimportant weights are forced to have a value of zero. Many pruning schemes have been proposed, but these have mainly focused on the number of pruned weights. Previous pruning schemes scarcely considered ASIC or FPGA accelerator architectures. When these pruned networks are run on accelerators, the lack of consideration of the architecture causes some inefficiency problems, including internal buffer misalignments and load imbalances. This paper proposes a new pruning scheme that reflects accelerator architectures. In the proposed scheme, pruning is performed so that the same number of weights remain for…
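The idea of pruning so that the same number of weights remain in every group can be illustrated with a minimal sketch. The abstract is cut off before it says what the groups are, so the grouping used here (fixed-size contiguous chunks of a flat weight list) is an assumption made purely for illustration, as are the names `prune_group_balanced`, `group_size`, and `keep`. The sketch uses plain magnitude ranking to decide which weights are "unimportant"; the paper's actual selection criterion and retraining procedure are not reproduced.

```python
from typing import List


def prune_group_balanced(weights: List[float], group_size: int, keep: int) -> List[float]:
    """Zero all but the `keep` largest-magnitude weights in each group.

    Hypothetical helper: `group_size` and `keep` are illustrative
    parameters; the source text does not define the grouping.
    """
    if group_size <= 0 or not 0 <= keep <= group_size:
        raise ValueError("need group_size > 0 and 0 <= keep <= group_size")
    if len(weights) % group_size != 0:
        raise ValueError("len(weights) must be a multiple of group_size")

    pruned = [0.0] * len(weights)
    for start in range(0, len(weights), group_size):
        positions = range(start, start + group_size)
        # Rank positions in this group by weight magnitude, largest first.
        ranked = sorted(positions, key=lambda i: abs(weights[i]), reverse=True)
        # Copy back only the survivors; every other position stays zero.
        for i in ranked[:keep]:
            pruned[i] = weights[i]
    return pruned


def nonzeros_per_group(weights: List[float], group_size: int) -> List[int]:
    """Count the nonzero weights in each contiguous group."""
    return [
        sum(1 for w in weights[s:s + group_size] if w != 0.0)
        for s in range(0, len(weights), group_size)
    ]


weights = [0.9, -0.1, 0.05, 0.4, 0.01, 0.02, -0.7, 0.3]
pruned = prune_group_balanced(weights, group_size=4, keep=2)
print(pruned)
print(nonzeros_per_group(pruned, group_size=4))
```

Because every group ends up with the same number of nonzero weights, processing elements that each handle one group would receive equal work. This is one plausible reading of how such a scheme avoids the load imbalances mentioned in the abstract.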
Taxonomy
Methods: Pruning · Average Pooling · Global Average Pooling · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Kaiming Initialization · Residual Connection · Convolution · Residual Block
