A Closer Look at Structured Pruning for Neural Network Compression

Elliot J. Crowley; Jack Turner; Amos Storkey; Michael O'Boyle

arXiv:1810.04622·stat.ML·June 10, 2019·22 cites

TL;DR

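This paper critically examines structured pruning for neural network compression. It finds that reduced networks (smaller versions of the original network trained from scratch) consistently outperform pruned-and-tuned networks, that a pruned network's architecture is significantly more competitive when trained from scratch, and that these architectures are scalable and faster at inference.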

Contribution

It demonstrates that reduced architectures trained from scratch surpass pruned-and-tuned networks, and it introduces a family of scalable architectures obtained by pruning once and then training the resulting architectures from scratch.

Findings

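01. Reduced networks (smaller versions of the original network trained from scratch) consistently outperform pruned-and-tuned networks.

02. Taking the architecture of a pruned network and training it from scratch makes it significantly more competitive.

03. Reduced networks are significantly faster in inference than pruned networks.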

Abstract

Structured pruning is a popular method for compressing a neural network: given a large trained network, one alternates between removing channel connections and fine-tuning, reducing the overall width of the network. However, the efficacy of structured pruning has largely evaded scrutiny. In this paper, we examine ResNets and DenseNets obtained through structured pruning-and-tuning and make two interesting observations: (i) reduced networks (smaller versions of the original network trained from scratch) consistently outperform pruned networks; (ii) if one takes the architecture of a pruned network and then trains it from scratch, it is significantly more competitive. Furthermore, these architectures are easy to approximate: we can prune once and obtain a family of new, scalable network architectures that can simply be trained from scratch. Finally, we compare the inference speed of…
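
The pruning-and-tuning loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the two-layer toy network, the L1-norm channel-ranking criterion, and the names `l1_norms`, `prune_one_channel`, `prune_and_tune`, and `fine_tune` are all assumptions made for the example. The abstract does not specify how channels are chosen for removal.

```python
from typing import Callable, List

# Toy layer: rows are output channels, columns are input channels.
Matrix = List[List[float]]


def l1_norms(layer: Matrix) -> List[float]:
    """L1 norm of each output channel's weights (assumed ranking criterion)."""
    return [sum(abs(w) for w in row) for row in layer]


def prune_one_channel(first: Matrix, second: Matrix) -> int:
    """Remove the lowest-norm output channel of `first` in place, together
    with the matching input column of `second`, so the two layers stay
    shape-compatible. Returns the index that was removed."""
    norms = l1_norms(first)
    idx = norms.index(min(norms))
    del first[idx]
    for row in second:
        del row[idx]
    return idx


def prune_and_tune(
    first: Matrix,
    second: Matrix,
    n_remove: int,
    fine_tune: Callable[[Matrix, Matrix], None],
) -> List[int]:
    """Alternate between removing one channel and fine-tuning."""
    removed = []
    for _ in range(n_remove):
        removed.append(prune_one_channel(first, second))
        fine_tune(first, second)  # placeholder for a few training steps
    return removed


# Hypothetical weights: 4 hidden channels between a 2-input and a 1-output layer.
first = [[0.5, -0.5], [0.1, 0.0], [1.0, 2.0], [0.2, -0.1]]
second = [[1.0, 2.0, 3.0, 4.0]]

# Fine-tuning is a no-op here; a real loop would run gradient updates.
removed = prune_and_tune(first, second, n_remove=2, fine_tune=lambda a, b: None)
```

Each returned index refers to the channel's position at the moment it was removed. The hidden width shrinks from 4 to 2, which is the sense in which structured pruning reduces the overall width of the network. A reduced network in the abstract's sense would instead start from the smaller shape and be trained from scratch, with no pruning loop at all.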

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning

Methods: Pruning · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings