AP: Selective Activation for De-sparsifying Pruned Neural Networks
Shiyu Liu, Rohan Ghosh, Dylan Tan, Mehul Motani

TL;DR
This paper introduces Activating-while-Pruning (AP), a method that enhances pruned neural networks by selectively activating nodes to reduce sparsity and improve performance across various architectures and datasets.
Contribution
AP is a method that works in tandem with existing pruning techniques to reduce the dynamic dead neuron rate (DNR) of pruned networks and improve their performance.
Findings
AP improves pruning performance by 3-4% on CIFAR datasets.
AP yields 2-3% accuracy gains on ImageNet and vision transformers.
Extensive experiments validate AP's effectiveness across multiple models and datasets.
Abstract
The rectified linear unit (ReLU) is a highly successful activation function in neural networks as it allows networks to easily obtain sparse representations, which reduces overfitting in overparameterized networks. However, in network pruning, we find that the sparsity introduced by ReLU, which we quantify by a term called dynamic dead neuron rate (DNR), is not beneficial for the pruned network. Interestingly, the more the network is pruned, the smaller the dynamic DNR becomes during optimization. This motivates us to propose a method to explicitly reduce the dynamic DNR for the pruned network, i.e., de-sparsify the network. We refer to our method as Activating-while-Pruning (AP). We note that AP does not function as a stand-alone method, as it does not evaluate the importance of weights. Instead, it works in tandem with existing pruning methods and aims to improve their performance by…
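To make the quantity in the abstract concrete, here is a minimal sketch of how a dead-neuron rate could be measured for a single ReLU layer. This summary does not give the paper's exact definition of dynamic DNR, so the sketch assumes a simple proxy: the fraction of post-ReLU outputs that are exactly zero on a batch of inputs. The function names (`dead_neuron_rate`, `layer_dnr`) and the optional pruning `mask` argument are hypothetical and used only for illustration.

```python
import numpy as np


def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)


def dead_neuron_rate(activations):
    """Fraction of post-ReLU outputs that are exactly zero.

    Assumed proxy for the dead neuron rate; the paper's precise
    definition of *dynamic* DNR may differ.
    """
    activations = np.asarray(activations, dtype=float)
    return float(np.mean(activations == 0.0))


def layer_dnr(weights, bias, inputs, mask=None):
    """Dead-neuron rate of one dense ReLU layer on a batch of inputs.

    `mask` is a hypothetical binary pruning mask with the same shape
    as `weights` (1 = keep the weight, 0 = prune it).
    """
    weights = np.asarray(weights, dtype=float)
    if mask is not None:
        weights = weights * np.asarray(mask, dtype=float)
    pre_activation = np.asarray(inputs, dtype=float) @ weights + bias
    return dead_neuron_rate(relu(pre_activation))


if __name__ == "__main__":
    x = np.array([[1.0, -1.0], [2.0, 0.0]])
    w = np.eye(2)
    b = np.zeros(2)
    print("DNR of the toy layer:", layer_dnr(w, b, x))
```

Tracking this value over training steps, before and after each pruning round, is one way to observe how the rate changes as more weights are removed.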
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and ELM · Neural Networks and Applications
Methods: Pruning · 1x1 Convolution · Batch Normalization · Residual Connection · Bottleneck Residual Block · Residual Block · Kaiming Initialization · Average Pooling · Convolution
