Structured Model Pruning of Convolutional Networks on Tensor Processing Units
Kongtao Chen, Ken Franko, Ruoxin Sang

TL;DR
This paper evaluates structured pruning of convolutional networks on TPUs, using VGG-16 trained on CIFAR-10 and ImageNet. It reports significant improvements in memory usage and speed without accuracy loss, especially on small datasets such as CIFAR-10, measured with a new TensorFlow2 pruning library that modifies models in place.
Contribution
It introduces a structured pruning library for TensorFlow2 that modifies models in place, and provides an empirical analysis of the accuracy-efficiency trade-off of pruning on TPUs for CIFAR-10 and ImageNet.
Findings
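Structured pruning significantly improves model memory usage and speed on TPUs.
Pruning preserves accuracy, especially on small datasets such as CIFAR-10.
The library modifies models in place instead of adding mask layers, so measurements reflect the actual performance of the pruned model.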
Abstract
The deployment of convolutional neural networks is often hindered by high computational and storage requirements. Structured model pruning is a promising approach to alleviate these requirements. Using the VGG-16 model as an example, we measure the accuracy-efficiency trade-off for various structured model pruning methods and datasets (CIFAR-10 and ImageNet) on Tensor Processing Units (TPUs). To measure the actual performance of models, we develop a structured model pruning library for TensorFlow2 to modify models in place (instead of adding mask layers). We show that structured model pruning can significantly improve model memory usage and speed on TPUs without losing accuracy, especially for small datasets (e.g., CIFAR-10).
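The difference between modifying a model in place and adding a mask layer can be illustrated with a minimal NumPy sketch. This is not the authors' library or its API. The function names are hypothetical, and the L1-norm filter ranking is an assumed criterion chosen for illustration; the abstract does not say which pruning criteria were evaluated.

```python
import numpy as np


def select_filters(kernel, keep_ratio):
    """Return sorted indices of the output filters with the largest L1 norm.

    kernel has shape (height, width, in_channels, out_channels).
    The L1-norm criterion is an assumption made for this sketch.
    """
    n_out = kernel.shape[-1]
    n_keep = max(1, int(round(n_out * keep_ratio)))
    scores = np.abs(kernel).sum(axis=(0, 1, 2))  # one score per output filter
    return np.sort(np.argsort(scores)[-n_keep:])


def prune_by_removal(kernel, bias, next_kernel, keep):
    """Physically drop pruned filters and the matching input channels
    of the following convolution, so the tensors actually shrink."""
    return kernel[..., keep], bias[keep], next_kernel[:, :, keep, :]


def prune_by_mask(kernel, keep):
    """Zero out pruned filters but keep the original tensor shape."""
    mask = np.zeros(kernel.shape[-1], dtype=kernel.dtype)
    mask[keep] = 1.0
    return kernel * mask


# Two consecutive 3x3 convolutions with VGG-like channel counts.
rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3, 64, 128))
bias = rng.normal(size=128)
next_kernel = rng.normal(size=(3, 3, 128, 256))

keep = select_filters(kernel, keep_ratio=0.5)
small_kernel, small_bias, small_next = prune_by_removal(kernel, bias, next_kernel, keep)
masked_kernel = prune_by_mask(kernel, keep)

print("masked: ", masked_kernel.shape, masked_kernel.size, "stored weights")
print("removed:", small_kernel.shape, small_kernel.size, "stored weights")
```

With masking, the kernel keeps its original shape, so the stored weight count and the convolution cost are unchanged even though half the filters are zero. Removing the filters shrinks both the pruned layer and the input side of the next layer, which is why in-place modification is needed to measure real memory and speed effects.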
Taxonomy
Topics: Advanced Neural Network Applications · Computational Physics and Python Applications · Parallel Computing and Optimization Techniques
Methods: Pruning
