To Filter Prune, or to Layer Prune, That Is The Question

Sara Elkerdawy, Mostafa Elhoushi, Abhineet Singh, Hong Zhang, and Nilanjan Ray

arXiv:2007.05667·cs.CV·November 10, 2020

TL;DR

This paper introduces LayerPrune, a layer pruning framework that achieves higher latency reduction than filter pruning, offering better accuracy-speed trade-offs and outperforming some handcrafted architectures on ImageNet.

Contribution

The paper proposes a layer pruning method that targets latency reduction directly, unlike traditional filter pruning, which mainly targets FLOPs, and demonstrates its effectiveness across multiple networks and hardware platforms.

Findings

01. LayerPrune achieves higher latency reduction than filter pruning at similar accuracy.

02. LayerPrune outperforms some handcrafted architectures on ImageNet.

03. Because layer pruning is not constrained by the original model's depth, it allows more flexible latency reduction than filter pruning.

Abstract

Recent advances in pruning of neural networks have made it possible to remove a large number of filters or weights without any perceptible drop in accuracy. The numbers of parameters and FLOPs are usually the reported metrics to measure the quality of the pruned models. However, the gain in speed for these pruned models is often overlooked in the literature due to the complex nature of latency measurements. In this paper, we show the limitation of filter pruning methods in terms of latency reduction and propose the LayerPrune framework. LayerPrune presents a set of layer pruning methods based on different criteria that achieve higher latency reduction than filter pruning methods at similar accuracy. The advantage of layer pruning over filter pruning in terms of latency reduction is a result of the fact that the former is not constrained by the original model's depth and thus allows…
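The depth argument in the abstract can be illustrated with a toy latency model. This is a rough sketch under stated assumptions, not the paper's measurement procedure: it assumes each layer costs a fixed overhead plus compute time proportional to its FLOPs. The functions `latency_ms`, `filter_prune`, and `layer_prune`, and the constants `overhead_ms` and `flops_per_ms`, are hypothetical names and values chosen only for illustration.

```python
# Toy model (an assumption, not from the paper): every layer pays a fixed
# per-layer overhead plus compute time proportional to its FLOPs.

def latency_ms(layer_flops, overhead_ms=0.05, flops_per_ms=1e9):
    """Estimated latency of a sequential network under the toy model."""
    return sum(overhead_ms + f / flops_per_ms for f in layer_flops)

def filter_prune(layer_flops, keep_ratio):
    """Shrink every layer's FLOPs by keep_ratio; depth is unchanged."""
    return [f * keep_ratio for f in layer_flops]

def layer_prune(layer_flops, n_remove):
    """Drop n_remove whole layers (which ones is arbitrary in this sketch)."""
    return layer_flops[: len(layer_flops) - n_remove]

# Hypothetical baseline: 20 identical layers of 1e8 FLOPs each.
base = [1e8] * 20
filtered = filter_prune(base, keep_ratio=0.5)  # half the FLOPs, 20 layers
layered = layer_prune(base, n_remove=10)       # half the FLOPs, 10 layers

print("baseline      :", latency_ms(base))
print("filter-pruned :", latency_ms(filtered))
print("layer-pruned  :", latency_ms(layered))

# Even with all FLOPs removed, filter pruning still pays the overhead
# of every remaining layer, so its latency has a depth-dependent floor.
print("filter floor  :", latency_ms(filter_prune(base, keep_ratio=0.0)))
```

Under these assumed constants, both pruned variants have the same total FLOPs, yet the layer-pruned one has lower estimated latency because it pays the per-layer overhead fewer times. Real latency depends on hardware, kernels, and memory traffic, so this only shows why equal FLOPs need not mean equal speed.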

Code & Models

Repositories

selkerdawy/filter-vs-layer-pruning (PyTorch, official)

Taxonomy

Methods: Pruning · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Pointwise Convolution · Depthwise Convolution · Max Pooling · Sigmoid Activation · 1x1 Convolution · Average Pooling · Depthwise Separable Convolution · Batch Normalization