Structured Model Pruning of Convolutional Networks on Tensor Processing Units

Kongtao Chen; Ken Franko; Ruoxin Sang

arXiv:2107.04191·cs.LG·July 22, 2021·21 cites



Open Access

TL;DR

This paper evaluates structured pruning of convolutional networks, using VGG-16 as the example, and reports significant memory and speed improvements on TPUs without accuracy loss, especially on small datasets such as CIFAR-10. The measurements rely on a new TensorFlow2 pruning library that modifies models in place.

Contribution

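The paper introduces a structured pruning library for TensorFlow2 that modifies models in place instead of adding mask layers, and it provides an empirical analysis of the accuracy-efficiency trade-off on TPUs for CIFAR-10 and ImageNet.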

Findings

01 Structured pruning improves memory and speed on TPUs.

02 Pruning maintains accuracy on small datasets like CIFAR-10.

03 The library enables in-place model modifications for efficient pruning.

Abstract

The deployment of convolutional neural networks is often hindered by high computational and storage requirements. Structured model pruning is a promising approach to alleviate these requirements. Using the VGG-16 model as an example, we measure the accuracy-efficiency trade-off for various structured model pruning methods and datasets (CIFAR-10 and ImageNet) on Tensor Processing Units (TPUs). To measure the actual performance of models, we develop a structured model pruning library for TensorFlow2 to modify models in place (instead of adding mask layers). We show that structured model pruning can significantly improve model memory usage and speed on TPUs without losing accuracy, especially for small datasets (e.g., CIFAR-10).
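
The difference between masking and modifying a model in place can be illustrated with a minimal NumPy sketch. It is not the paper's TensorFlow2 library: the function name `prune_conv_filters`, the `keep_ratio` parameter, and the L1-norm ranking criterion are assumptions chosen for illustration, and the abstract does not state which ranking criteria the authors used. The sketch removes whole output filters from one convolution kernel and the matching input channels from the next, so the tensors actually shrink.

```python
import numpy as np


def prune_conv_filters(kernel, bias, next_kernel, keep_ratio):
    """Drop the lowest-scoring output filters of a conv layer (illustrative only).

    kernel:      (kh, kw, c_in, c_out) weights of the layer being pruned
    bias:        (c_out,) bias of the layer being pruned
    next_kernel: (kh, kw, c_out, c_next) weights of the following conv layer
    keep_ratio:  fraction of output filters to keep (hypothetical parameter)
    """
    c_out = kernel.shape[-1]
    n_keep = max(1, int(round(c_out * keep_ratio)))

    # Assumed criterion: rank each output filter by the L1 norm of its weights.
    scores = np.abs(kernel).sum(axis=(0, 1, 2))

    # Indices of the n_keep highest-scoring filters, restored to original order.
    keep = np.sort(np.argsort(scores)[-n_keep:])

    # Structured pruning: slice the tensors so the pruned filters no longer exist.
    # The next layer loses the corresponding input channels.
    return kernel[..., keep], bias[keep], next_kernel[:, :, keep, :], keep


def mask_conv_filters(kernel, keep):
    """Mask-based alternative: zero the pruned filters but keep the full shape."""
    mask = np.zeros(kernel.shape[-1], dtype=kernel.dtype)
    mask[keep] = 1
    return kernel * mask
```

With a mask, the kernel keeps its original shape, so memory use and compute stay the same even though some filters are zero. Slicing reduces the tensor sizes themselves, which is why in-place modification lets memory and speed be measured on the pruned model directly.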


Taxonomy

Topics: Advanced Neural Network Applications · Computational Physics and Python Applications · Parallel Computing and Optimization Techniques

Methods: Pruning