UPSCALE: Unconstrained Channel Pruning

Alvin Wan; Hanxiang Hao; Kaushik Patnaik; Yueyang Xu; Omer Hadad; David Güera; Zhile Ren; Qi Shan

arXiv:2307.08771·cs.CV·July 19, 2023

TL;DR

UPSCALE reorders channels at export time. This reduces the inference-time memory copies that channel pruning introduces in multi-branch segments, which lowers latency, and it lets pruners drop the constraint that certain channels be pruned together, which improves accuracy.

Contribution

It proposes a generic algorithm that supports any pruning pattern without pruning constraints, improving accuracy by lifting those constraints and improving speed by reordering channels at export time.

Findings

01 Raises ImageNet accuracy by 2.1 points on average for post-training pruned models

02 Improves inference speed by up to 2x over a baseline export

03 Benefits hold across multiple architectures, including DenseNet, EfficientNetV2, and ResNet

Abstract

As neural networks grow in size and complexity, inference speeds decline. To combat this, one of the most effective compression techniques -- channel pruning -- removes channels from weights. However, for multi-branch segments of a model, channel removal can introduce inference-time memory copies. In turn, these copies increase inference latency -- so much so that the pruned model can be slower than the unpruned model. As a workaround, pruners conventionally constrain certain channels to be pruned together. This fully eliminates memory copies but, as we show, significantly impairs accuracy. We now have a dilemma: Remove constraints but increase latency, or add constraints and impair accuracy. In response, our insight is to reorder channels at export time, (1) reducing latency by reducing memory copies and (2) improving accuracy by removing constraints. Using this insight, we design a…
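
The reordering insight in the abstract can be illustrated with a toy sketch. It assumes one producer layer whose output feeds two consumers, each keeping a different, unconstrained subset of channels. The channel indices and the helper names (`is_contiguous`, `reorder_for_two_consumers`, `positions`) are hypothetical, and this is not the paper's algorithm, which handles general multi-branch graphs. The sketch only shows why a good channel order lets each consumer read a contiguous slice (no copy) instead of gathering scattered channels (a copy).

```python
def is_contiguous(indices):
    """True if sorted indices form one unbroken run, i.e. a plain slice."""
    return all(b - a == 1 for a, b in zip(indices, indices[1:]))


def reorder_for_two_consumers(keep_a, keep_b):
    """Order producer channels as: A-only, shared, B-only.

    Channels that neither consumer keeps are dropped from the producer.
    This layout only covers the two-consumer case (an assumption of
    this sketch).
    """
    a, b = set(keep_a), set(keep_b)
    return sorted(a - b) + sorted(a & b) + sorted(b - a)


def positions(order, keep):
    """Positions in the producer output that one consumer has to read."""
    return sorted(order.index(c) for c in keep)


# Hypothetical unconstrained pruning result for an 8-channel producer:
# each consumer keeps its own subset; channels 1 and 4 are unused.
keep_a = [0, 2, 5, 6]
keep_b = [2, 3, 6, 7]

# Baseline export: keep the surviving channels in their original order.
baseline = sorted(set(keep_a) | set(keep_b))
base_a = positions(baseline, keep_a)
base_b = positions(baseline, keep_b)

# Reordered export: each consumer's channels become one contiguous run.
# The consumers' input weights would be permuted to match this order.
order = reorder_for_two_consumers(keep_a, keep_b)
pos_a = positions(order, keep_a)
pos_b = positions(order, keep_b)

print("baseline :", baseline, is_contiguous(base_a), is_contiguous(base_b))
print("reordered:", order, is_contiguous(pos_a), is_contiguous(pos_b))
```

In the baseline order both consumers need scattered channels, so each read would be a gather. After reordering, consumer A reads positions 0 to 3 and consumer B reads positions 2 to 5, so both reads are slices. With more than two consumers a copy-free order may not exist, which is why the abstract speaks of reducing copies rather than eliminating them.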

Code & Models

Repositories

apple/ml-upscale · PyTorch · Official

Videos

UPSCALE: Unconstrained Channel Pruning · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning

Methods: Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution · Concatenated Skip Connection · Batch Normalization · 1x1 Convolution · Dense Block · Max Pooling · Residual Connection · Residual Block