Towards Learning of Filter-Level Heterogeneous Compression of Convolutional Neural Networks
Yochai Zur, Chaim Baskin, Evgenii Zheltonozhskii, Brian Chmiel, Itay Evron, Alex M. Bronstein, Avi Mendelson

TL;DR
This paper explores optimizing filter-level heterogeneity in CNN compression through neural architecture search, focusing on quantization and pruning to meet computational budgets while maintaining accuracy.
Contribution
It formulates the problem of filter-level heterogeneity in CNN compression as a neural architecture search task, introducing a differentiable method for optimal configuration discovery.
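To make the idea of a differentiable search over filter-level bit widths concrete, here is a minimal sketch, not the authors' implementation. It assumes each filter carries one learnable logit per candidate bit width, that a softmax turns those logits into a distribution, and that compute cost is modeled as MACs × weight bits × activation bits. The candidate set `CANDIDATE_BITS`, the cost model, and all function names are hypothetical choices made for illustration.

```python
import math

# Hypothetical candidate weight bit widths (an assumption, not from the paper).
CANDIDATE_BITS = [2, 4, 8]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_bits(filter_logits):
    """Expected bit width of each filter under its softmax distribution."""
    return [
        sum(p * b for p, b in zip(softmax(logits), CANDIDATE_BITS))
        for logits in filter_logits
    ]


def relaxed_cost(filter_logits, macs_per_filter, activation_bits=8):
    """Smooth surrogate for compute cost.

    Assumed cost model: MACs * E[weight bits] * activation bits, summed
    over filters. Because it is a smooth function of the logits, it could
    be added to a training loss as a budget penalty.
    """
    return sum(
        macs * bits * activation_bits
        for macs, bits in zip(macs_per_filter, expected_bits(filter_logits))
    )


def discretize(filter_logits):
    """Pick the highest-scoring candidate bit width for each filter."""
    return [
        CANDIDATE_BITS[max(range(len(logits)), key=logits.__getitem__)]
        for logits in filter_logits
    ]


if __name__ == "__main__":
    logits = [[0.0, 0.0, 5.0], [3.0, 0.0, 0.0]]
    print(discretize(logits))
    print(relaxed_cost(logits, macs_per_filter=[100, 100]))
```

In a real search the logits would be trained by gradient descent alongside the network weights, with the relaxed cost acting as a penalty; this sketch only shows the relaxation and the final discretization step.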
Findings
Heterogeneous quantized networks show high variance, which calls their benefit into question.
Pruning improvements over homogeneous cases are possible but challenging with current methods.
The proposed search method can identify configurations that satisfy computational constraints.
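As an illustration of what satisfying a computational constraint can mean for a filter-level configuration, the following sketch compares a homogeneous and a heterogeneous assignment against the same budget. It rests on assumptions that are not taken from the paper: cost is modeled as MACs × weight bits × activation bits, a pruned filter is marked with `None` and costs nothing, and the names `config_cost` and `satisfies_budget` are hypothetical.

```python
def config_cost(bits_per_filter, macs_per_filter, activation_bits=8):
    """Cost of a discrete per-filter configuration.

    Assumed cost model: MACs * weight bits * activation bits per filter.
    A bit width of None marks a pruned filter, which contributes no cost.
    """
    if len(bits_per_filter) != len(macs_per_filter):
        raise ValueError("one bit width is required per filter")
    return sum(
        macs * bits * activation_bits
        for bits, macs in zip(bits_per_filter, macs_per_filter)
        if bits is not None
    )


def satisfies_budget(bits_per_filter, macs_per_filter, budget, activation_bits=8):
    """True if the configuration's cost does not exceed the budget."""
    return config_cost(bits_per_filter, macs_per_filter, activation_bits) <= budget


if __name__ == "__main__":
    macs = [100, 100, 100, 100]          # illustrative MAC counts per filter
    homogeneous = [4, 4, 4, 4]           # every filter at 4 bits
    heterogeneous = [8, 4, 2, None]      # mixed widths, last filter pruned

    budget = config_cost(homogeneous, macs)
    print(budget, config_cost(heterogeneous, macs))
    print(satisfies_budget(heterogeneous, macs, budget))
```

Under this toy cost model the heterogeneous assignment fits the homogeneous budget while keeping one filter at 8 bits. Whether such an assignment also preserves accuracy is the empirical question the findings above address.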
Abstract
Recently, deep learning has become a de facto standard in machine learning with convolutional neural networks (CNNs) demonstrating spectacular success on a wide variety of tasks. However, CNNs are typically very demanding computationally at inference time. One of the ways to alleviate this burden on certain hardware platforms is quantization relying on the use of low-precision arithmetic representation for the weights and the activations. Another popular method is the pruning of the number of filters in each layer. While mainstream deep learning methods train the neural network weights while keeping the network architecture fixed, the emerging neural architecture search (NAS) techniques make the latter also amenable to training. In this paper, we formulate optimal arithmetic bit length allocation and neural network pruning as a NAS problem, searching for the configurations satisfying a…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Image and Signal Denoising Methods · Advanced Neural Network Applications
Methods: Pruning · Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory
