Accelerator-Aware Pruning for Convolutional Neural Networks

Hyeong-Ju Kang

arXiv:1804.09862 · cs.NE · September 8, 2020

TL;DR

This paper introduces an accelerator-aware pruning method for CNNs that aligns the pruning pattern with the constraints of ASIC and FPGA accelerator architectures, improving hardware efficiency and performance. The method applies to a range of network types.

Contribution

The paper proposes a pruning scheme that takes the accelerator architecture into account, maintaining high pruning ratios while improving accelerator efficiency and reducing the logic complexity of sparse accelerators.

Findings

01. Doubling of accelerator performance with the proposed pruning scheme

02. High pruning ratios achieved on diverse networks including AlexNet, VGG16, ResNet, MobileNet

03. Reduced logic complexity of sparse accelerators

Abstract

Convolutional neural networks have shown tremendous performance capabilities in computer vision tasks, but their excessive amounts of weight storage and arithmetic operations prevent them from being adopted in embedded environments. One of the solutions involves pruning, where certain unimportant weights are forced to have a value of zero. Many pruning schemes have been proposed, but these have mainly focused on the number of pruned weights. Previous pruning schemes scarcely considered ASIC or FPGA accelerator architectures. When these pruned networks are run on accelerators, the lack of consideration of the architecture causes some inefficiency problems, including internal buffer misalignments and load imbalances. This paper proposes a new pruning scheme that reflects accelerator architectures. In the proposed scheme, pruning is performed so that the same number of weights remain for…
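
The idea of leaving the same number of weights in every group can be illustrated with a small sketch. The abstract is cut off before it says how the groups are defined, so the grouping below (consecutive fixed-size chunks of a flat weight list) is an assumption made only for illustration, and the function name and parameters `group_size` and `keep` are hypothetical. The magnitude-based choice of which weights to keep is likewise a common heuristic, not necessarily the criterion used in the paper.

```python
def prune_equal_per_group(weights, group_size, keep):
    """Zero all but the `keep` largest-magnitude weights in each group.

    `weights` is a flat list of floats. Groups are consecutive chunks of
    `group_size` entries; this grouping is an illustrative assumption,
    not the grouping defined in the paper.
    """
    if len(weights) % group_size != 0:
        raise ValueError("len(weights) must be a multiple of group_size")
    if not 0 <= keep <= group_size:
        raise ValueError("keep must be between 0 and group_size")

    pruned = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # Rank positions by descending magnitude; ties go to the earlier index.
        order = sorted(range(group_size), key=lambda i: (-abs(group[i]), i))
        kept = set(order[:keep])
        # Pruned weights are forced to zero; kept weights are unchanged.
        pruned.extend(w if i in kept else 0.0 for i, w in enumerate(group))
    return pruned


weights = [0.9, -0.1, 0.05, 0.4, 0.02, -0.7, 0.3, 0.01]
pruned = prune_equal_per_group(weights, group_size=4, keep=2)
print(pruned)  # [0.9, 0.0, 0.0, 0.4, 0.0, -0.7, 0.3, 0.0]
```

Because every group ends up with exactly `keep` nonzero weights, each group needs the same amount of storage and the same number of multiplications. A fixed per-group budget of this kind is one way to avoid the buffer misalignment and load imbalance that the abstract attributes to pruning schemes that only count the total number of pruned weights.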

Taxonomy

Methods: Pruning · Average Pooling · Global Average Pooling · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Kaiming Initialization · Residual Connection · Convolution · Residual Block