Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian Processes
Marcelo Gennari do Nascimento, Theo W. Costain, Victor Adrian Prisacariu

TL;DR
This paper introduces a method to optimize non-uniform quantization schemes for neural networks by using multi-task Gaussian processes to efficiently search for layer-wise bit configurations, achieving memory savings with minimal accuracy loss.
Contribution
It presents a novel hyperparameter search approach for neural network quantization using multi-task Gaussian processes, enabling layer-wise bit optimization for improved memory efficiency.
Findings
Significantly lower precision in the last layers causes only a minimal loss of accuracy.
Non-uniform bit allocation yields appreciable memory savings.
Tested on the CIFAR10 and ImageNet datasets with the VGG, ResNet and GoogLeNet architectures.
Abstract
We propose a novel method for neural network quantization that casts the neural architecture search problem as one of hyperparameter search to find non-uniform bit distributions throughout the layers of a CNN. We perform the search assuming a Multi-Task Gaussian Process prior, which splits the problem into multiple tasks, each corresponding to a different number of training epochs, and explore the space by sampling the configurations that yield maximum information. We then show that with significantly lower precision in the last layers we achieve a minimal loss of accuracy with appreciable memory savings. We test our findings on the CIFAR10 and ImageNet datasets using the VGG, ResNet and GoogLeNet architectures.
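To make the idea of a non-uniform bit distribution concrete, the sketch below compares the weight memory of a layer-wise bit configuration against a uniform one. It is only an illustration and not the authors' implementation: the layer names, parameter counts, bit widths, and the symmetric uniform per-layer quantizer are all assumptions chosen for the example, and the Gaussian process search itself is not shown.

```python
from typing import Dict, List


def model_size_bits(param_counts: Dict[str, int], bits_per_layer: Dict[str, int]) -> int:
    """Total weight storage in bits for a given layer-wise bit configuration."""
    return sum(param_counts[name] * bits_per_layer[name] for name in param_counts)


def quantize_symmetric(weights: List[float], bits: int) -> List[float]:
    """Round weights onto a symmetric grid with 2**(bits-1) - 1 positive levels.

    Assumed quantizer for illustration; the paper may use a different one.
    """
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits (one sign bit plus magnitude)")
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale for all-zero weights
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


# Hypothetical three-layer network: parameter counts are made up.
param_counts = {"conv1": 1_000, "conv2": 5_000, "fc": 10_000}

# Uniform baseline versus a configuration with lower precision in the last layer.
uniform_bits = {"conv1": 8, "conv2": 8, "fc": 8}
nonuniform_bits = {"conv1": 8, "conv2": 6, "fc": 3}

uniform_size = model_size_bits(param_counts, uniform_bits)
nonuniform_size = model_size_bits(param_counts, nonuniform_bits)
relative_size = nonuniform_size / uniform_size

print(f"uniform: {uniform_size} bits, non-uniform: {nonuniform_size} bits")
print(f"relative size: {relative_size:.3f}")
```

In a search like the one the abstract describes, each candidate `bits_per_layer` dictionary would be one point in the hyperparameter space, scored by both its accuracy after training and its memory footprint.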
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Advanced Neural Network Applications
Methods: 1x1 Convolution · Dense Connections · Average Pooling · Inception Module · Dropout · Batch Normalization · Global Average Pooling · Auxiliary Classifier · Softmax · Bottleneck Residual Block
