Fire Together Wire Together: A Dynamic Pruning Approach with Self-Supervised Mask Prediction
Sara Elkerdawy, Mostafa Elhoushi, Hong Zhang, Nilanjan Ray

TL;DR
This paper introduces a self-supervised dynamic pruning method inspired by neuroscience, which predicts filter masks based on activations, enabling efficient FLOPs reduction with transparent hyperparameter tuning and decoupled loss functions.
Contribution
It proposes a novel self-supervised mask prediction approach for dynamic pruning that simplifies hyperparameter selection and decouples task and pruning losses.
Findings
Achieves comparable accuracy to SOTA with higher FLOPs reduction on CIFAR.
Attains lower accuracy drop with up to 13% more FLOPs reduction on ImageNet.
Demonstrates effectiveness across multiple architectures and datasets.
Abstract
Dynamic model pruning is a recent direction that allows for the inference of a different sub-network for each input sample during deployment. However, current dynamic methods rely on learning a continuous channel gating through regularization by inducing sparsity loss. This formulation introduces complexity in balancing different losses (e.g., task loss, regularization loss). In addition, regularization-based methods lack transparent tradeoff hyperparameter selection to realize a computational budget. Our contribution is two-fold: 1) decoupled task and pruning losses; 2) simple hyperparameter selection that enables FLOPs reduction estimation before training. Inspired by the Hebbian theory in neuroscience, "neurons that fire together wire together", we propose to predict a mask to process k filters in a layer based on the activation of its previous layer. We pose the problem as a…
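The idea of keeping k filters per layer, and of estimating the FLOPs reduction from k before any training, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the per-filter scores are hypothetical inputs (the abstract excerpt above does not say how they are produced), k is treated as a given integer, and the helper names `topk_mask`, `conv_flops`, and `flops_reduction` are invented for this example.

```python
from typing import List


def topk_mask(scores: List[float], k: int) -> List[int]:
    """Return a binary mask that keeps the k highest-scoring filters.

    `scores` is a hypothetical per-filter score; how such scores are
    predicted from the previous layer is not covered by this sketch.
    """
    if not 0 <= k <= len(scores):
        raise ValueError("k must be between 0 and the number of filters")
    # Indices of the k largest scores (ties broken by lower index).
    keep = sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]
    mask = [0] * len(scores)
    for i in keep:
        mask[i] = 1
    return mask


def conv_flops(c_in: int, c_out: int, kernel: int, h_out: int, w_out: int) -> int:
    """Multiply-accumulate count of a dense 2D convolution (bias ignored)."""
    return c_in * c_out * kernel * kernel * h_out * w_out


def flops_reduction(c_in: int, c_out: int, kernel: int, h_out: int,
                    w_out: int, k_in: int, k_out: int) -> float:
    """Fraction of FLOPs saved when only k_in input channels and
    k_out output filters of the layer are processed."""
    dense = conv_flops(c_in, c_out, kernel, h_out, w_out)
    pruned = conv_flops(k_in, k_out, kernel, h_out, w_out)
    return 1.0 - pruned / dense


if __name__ == "__main__":
    # Keep the 2 strongest of 4 filters.
    print(topk_mask([0.1, 0.9, 0.4, 0.7], k=2))
    # Keeping 64 of 128 output filters halves this layer's cost.
    print(flops_reduction(64, 128, 3, 16, 16, k_in=64, k_out=64))
```

Because the saving in this sketch depends only on the layer shape and the chosen k, it can be computed without running any training, which is the sense in which a budget can be estimated up front.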
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Brain Tumor Detection and Classification
Methods: Pruning · 1x1 Convolution · Softmax · Batch Normalization · Residual Connection · Convolution · Average Pooling · Bottleneck Residual Block · Dropout
