AP: Selective Activation for De-sparsifying Pruned Neural Networks

Shiyu Liu; Rohan Ghosh; Dylan Tan; Mehul Motani

arXiv:2212.06145 · cs.LG · December 14, 2022

TL;DR

This paper introduces Activating-while-Pruning (AP), a method that enhances pruned neural networks by selectively activating nodes to reduce sparsity and improve performance across various architectures and datasets.

Contribution

AP is not a stand-alone pruning method. It works alongside existing pruning techniques, reducing the dynamic dead neuron rate (DNR) of the pruned network to improve its performance.

Findings

01 · Combined with existing pruning methods, AP improves their performance by 3-4% on CIFAR datasets.

02 · On larger-scale settings (ImageNet, vision transformers), AP yields accuracy gains of 2-3%.

03 · Experiments across multiple models and datasets support AP's effectiveness.

Abstract

The rectified linear unit (ReLU) is a highly successful activation function in neural networks as it allows networks to easily obtain sparse representations, which reduces overfitting in overparameterized networks. However, in network pruning, we find that the sparsity introduced by ReLU, which we quantify by a term called dynamic dead neuron rate (DNR), is not beneficial for the pruned network. Interestingly, the more the network is pruned, the smaller the dynamic DNR becomes during optimization. This motivates us to propose a method to explicitly reduce the dynamic DNR for the pruned network, i.e., de-sparsify the network. We refer to our method as Activating-while-Pruning (AP). We note that AP does not function as a stand-alone method, as it does not evaluate the importance of weights. Instead, it works in tandem with existing pruning methods and aims to improve their performance by…
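The abstract names the dynamic DNR but this excerpt does not define it. The sketch below takes one plausible reading as an assumption: the fraction of ReLU outputs that are exactly zero, averaged over a batch of inputs. The helpers `dense_relu`, `dead_neuron_rate`, and `magnitude_mask` are hypothetical names for illustration, and magnitude pruning is used only as a stand-in for "an existing pruning method"; none of this is the authors' implementation.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def dense_relu(inputs, weights, mask):
    """One dense ReLU layer; a 0 in mask marks a pruned weight."""
    outputs = []
    for w_row, m_row in zip(weights, mask):
        pre = sum(w * m * x for w, m, x in zip(w_row, m_row, inputs))
        outputs.append(relu(pre))
    return outputs


def dead_neuron_rate(batch, weights, mask):
    """Assumed DNR: share of ReLU outputs equal to zero over the batch."""
    zeros = 0
    total = 0
    for inputs in batch:
        out = dense_relu(inputs, weights, mask)
        zeros += sum(1 for v in out if v == 0.0)
        total += len(out)
    return zeros / total


def magnitude_mask(weights, sparsity):
    """Stand-in pruning rule: zero out the smallest-magnitude weights."""
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)
    threshold = flat[k - 1] if k > 0 else -1.0
    return [[0 if abs(w) <= threshold else 1 for w in row] for row in weights]


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(16)]
    batch = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(32)]
    for sparsity in (0.0, 0.5, 0.9):
        mask = magnitude_mask(weights, sparsity)
        rate = dead_neuron_rate(batch, weights, mask)
        print(f"sparsity={sparsity:.1f}  assumed DNR={rate:.3f}")
```

This toy measures the quantity at a single random layer with no training, so it says nothing about how the DNR evolves during optimization; it only makes concrete what "fraction of inactive ReLU outputs" could mean when tracked alongside a pruning mask.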

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and ELM · Neural Networks and Applications

Methods: Pruning · 1x1 Convolution · Batch Normalization · Residual Connection · Bottleneck Residual Block · Residual Block · Kaiming Initialization · Average Pooling · Convolution