Dynamic Runtime Feature Map Pruning
Tailin Liang, Lei Wang, Shaobo Shi, John Glossner

TL;DR
This paper introduces a dynamic runtime feature map pruning method for deep neural networks, reducing bandwidth and computation by removing unnecessary feature maps during execution without significant accuracy loss.
Contribution
It proposes a novel dynamic pruning technique that adaptively removes feature maps at runtime, extending static pruning methods and achieving additional bandwidth savings.
Findings
10% of feature map execution can be pruned without accuracy loss
A further 5% reduction is achieved by epsilon-based pruning, at the cost of a 1% accuracy loss
Networks with ReLU activation have high parameter sparsity suitable for static pruning
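
The dynamic and epsilon-based pruning described in the findings above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the pruning criterion (skip a channel when every activation lies within epsilon of zero), the function names `is_prunable` and `select_active_maps`, and the toy data are all assumptions made for the example.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]  # one channel: rows of activation values


def is_prunable(fmap: FeatureMap, epsilon: float = 0.0) -> bool:
    """Assumed criterion: every activation in the channel is within epsilon of zero."""
    return all(abs(v) <= epsilon for row in fmap for v in row)


def select_active_maps(
    fmaps: List[FeatureMap], epsilon: float = 0.0
) -> Tuple[List[int], float]:
    """Return the indices of channels worth loading and the fraction skipped."""
    keep = [i for i, fmap in enumerate(fmaps) if not is_prunable(fmap, epsilon)]
    pruned_fraction = 1.0 - len(keep) / len(fmaps) if fmaps else 0.0
    return keep, pruned_fraction


# Hypothetical activations for four channels of one layer.
fmaps = [
    [[0.0, 0.0], [0.0, 0.0]],      # all zero, e.g. after ReLU
    [[0.5, 0.0], [1.2, 0.0]],      # active
    [[0.001, 0.0], [0.0, 0.002]],  # near zero
    [[0.0, 3.0], [0.0, 0.0]],      # active
]

# epsilon = 0 skips only exactly-zero channels.
exact_keep, exact_frac = select_active_maps(fmaps)
# A small epsilon additionally skips near-zero channels.
eps_keep, eps_frac = select_active_maps(fmaps, epsilon=0.01)
```

With epsilon set to zero only the all-zero channel is skipped, so the output is unchanged. A nonzero epsilon also drops the near-zero channel, which is where an accuracy cost can arise.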
Abstract
High bandwidth requirements are an obstacle for accelerating the training and inference of deep neural networks. Most previous research focuses on reducing the size of kernel maps for inference. We analyze parameter sparsity of six popular convolutional neural networks: AlexNet, MobileNet, ResNet-50, SqueezeNet, TinyNet, and VGG16. Of the networks considered, those using ReLU (AlexNet, SqueezeNet, VGG16) contain a high percentage of 0-valued parameters and can be statically pruned. Networks with non-ReLU activation functions may in some cases not contain any 0-valued parameters (ResNet-50, TinyNet). We also investigate runtime feature map usage and find that input feature maps comprise the majority of bandwidth requirements when depth-wise and point-wise convolutions are used. We introduce dynamic runtime pruning of feature maps and show that 10% of dynamic feature map…
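
As a rough illustration of the parameter-sparsity analysis mentioned in the abstract, the fraction of 0-valued parameters can be counted per layer and overall. The helper name `zero_fraction` and the layer names and weights below are hypothetical; they are not taken from the paper or from any of the listed networks.

```python
from typing import Dict, Iterable, List


def zero_fraction(weights: Iterable[float]) -> float:
    """Fraction of parameters that are exactly zero (0.0 for an empty input)."""
    values = list(weights)
    if not values:
        return 0.0
    return sum(1 for w in values if w == 0.0) / len(values)


# Hypothetical flattened weights for two layers.
layers: Dict[str, List[float]] = {
    "conv1": [0.0, 0.3, -0.1, 0.0],
    "conv2": [0.2, 0.0, 0.0, 0.0, 0.7, 0.0],
}

per_layer = {name: zero_fraction(ws) for name, ws in layers.items()}
overall = zero_fraction(w for ws in layers.values() for w in ws)
```

A layer with a high zero fraction is a candidate for static pruning, since those parameters never need to be stored or loaded.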
Taxonomy
Topics: Graph Theory and Algorithms · Web Data Mining and Analysis · Software Testing and Debugging Techniques
Methods: Pruning · Residual Connection · Average Pooling · Fire Module · Local Response Normalization · Global Average Pooling · Grouped Convolution · 1x1 Convolution · Dropout
