Soft Threshold Weight Reparameterization for Learnable Sparsity

Aditya Kusupati; Vivek Ramanujan; Raghav Somani; Mitchell Wortsman; Prateek Jain; Sham Kakade; Ali Farhadi

arXiv:2002.03231 · cs.LG · June 24, 2020 · 82 citations

TL;DR

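This paper introduces Soft Threshold Reparameterization (STR), which applies the soft-threshold operator to network weights and learns the pruning thresholds, yielding non-uniform sparsity budgets. It reports state-of-the-art accuracy for unstructured sparsity in CNNs, up to 50% fewer FLOPs, and up to 10% higher accuracy than existing results in the ultra sparse (99%) regime.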

Contribution

STR applies the soft-threshold operator to DNN weights and learns layer-wise pruning thresholds, outperforming methods that rely on uniform or heuristic non-uniform sparsity budgets.

Findings

1. Achieves state-of-the-art accuracy for unstructured sparsity in CNNs (ResNet50 and MobileNetV1 on ImageNet-1K).

2. Reduces FLOPs by up to 50% through learned non-uniform sparsity budgets.

3. Boosts accuracy over existing results by up to 10% in the ultra sparse (99%) regime.

Abstract

Sparsity in Deep Neural Networks (DNNs) is studied extensively with the focus of maximizing prediction accuracy given an overall parameter budget. Existing methods rely on uniform or heuristic non-uniform sparsity budgets which have sub-optimal layer-wise parameter allocation resulting in a) lower prediction accuracy or b) higher inference cost (FLOPs). This work proposes Soft Threshold Reparameterization (STR), a novel use of the soft-threshold operator on DNN weights. STR smoothly induces sparsity while learning pruning thresholds thereby obtaining a non-uniform sparsity budget. Our method achieves state-of-the-art accuracy for unstructured sparsity in CNNs (ResNet50 and MobileNetV1 on ImageNet-1K), and, additionally, learns non-uniform budgets that empirically reduce the FLOPs by up to 50%. Notably, STR boosts the accuracy over existing results by up to 10% in the ultra sparse (99%)…
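To make the core idea concrete, here is a minimal sketch of the standard soft-threshold operator, sign(w) · max(|w| − α, 0), applied with a separate threshold per layer. It only illustrates how different thresholds lead to different per-layer sparsity; it is not the authors' implementation. The layer names, toy weights, and fixed threshold values are hypothetical, and in STR the thresholds would be learned during training rather than set by hand.

```python
def soft_threshold(w: float, alpha: float) -> float:
    """Shrink w toward zero by alpha; return exactly 0.0 when |w| <= alpha."""
    magnitude = abs(w) - alpha
    if magnitude <= 0.0:
        return 0.0
    return magnitude if w > 0 else -magnitude


def apply_layer(weights: list[float], alpha: float) -> list[float]:
    """Apply the soft-threshold operator to every weight in one layer."""
    return [soft_threshold(w, alpha) for w in weights]


def sparsity(weights: list[float]) -> float:
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Hypothetical toy layers and hand-picked thresholds (assumptions for
# illustration only; STR learns the thresholds instead of fixing them).
layers = {
    "conv1": [0.9, -0.05, 0.3, -0.6],
    "fc": [0.9, -0.05, 0.3, -0.6],
}
thresholds = {"conv1": 0.1, "fc": 0.5}

sparse_layers = {name: apply_layer(w, thresholds[name]) for name, w in layers.items()}
budget = {name: sparsity(w) for name, w in sparse_layers.items()}
print(budget)
```

With these toy values, the same weights end up 25% sparse under the smaller threshold and 50% sparse under the larger one, which is the sense in which per-layer thresholds induce a non-uniform sparsity budget.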

Code & Models

Repositories

RAIVNLab/STR (PyTorch, official)

Videos

Soft Threshold Weight Reparameterization for Learnable Sparsity · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Infrastructure Maintenance and Monitoring

Methods: Pruning · Depthwise Convolution · Pointwise Convolution · Average Pooling · Global Average Pooling · Depthwise Separable Convolution · 1x1 Convolution · Batch Normalization · Dense Connections