Adaptive Binary-Ternary Quantization

Ryan Razani, Grégoire Morin, Vahid Partovi Nia, and Eyyüb Sari

arXiv:1909.12205·cs.LG·September 15, 2021

Open Access

TL;DR

This paper introduces Smart Quantization (SQ), an adaptive method that combines binary and ternary quantization in neural networks, so that a single training run can produce an accurate model for resource-constrained devices.

Contribution

The paper proposes a novel adaptive quantization method that adjusts quantization depth during training, reducing the need for multiple training runs.

Findings

01. Successfully adapts quantization depth during training

02. Maintains high accuracy on MNIST and CIFAR10

03. Reduces training complexity for quantized models

Abstract

Neural network models are resource hungry. It is difficult to deploy such deep networks on devices with limited resources, such as smart wearables, cellphones, drones, and autonomous vehicles. Low-bit quantization, such as binary and ternary quantization, is a common approach to alleviating these resource requirements. Ternary quantization provides a more flexible model and outperforms binary quantization in terms of accuracy; however, it doubles the memory footprint and increases the computational cost. In contrast to these approaches, mixed quantized models allow a trade-off between accuracy and memory footprint. In such models, quantization depth is often chosen manually or tuned using a separate optimization routine. The latter requires training a quantized network multiple times. Here, we propose an adaptive combination of binary and ternary quantization, namely Smart Quantization (SQ), in…
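To make the binary versus ternary trade-off in the abstract concrete, the following is a minimal sketch of generic sign-based binarization and threshold-based ternarization. It is not the paper's Smart Quantization procedure; the function names, the sample weights, and the `threshold` value are illustrative assumptions, and the storage figures assume 1 bit per binary weight and 2 bits per ternary weight.

```python
from typing import List


def binarize(weights: List[float]) -> List[int]:
    """Map each weight to -1 or +1 by sign (zero maps to +1 by convention here)."""
    return [1 if w >= 0 else -1 for w in weights]


def ternarize(weights: List[float], threshold: float) -> List[int]:
    """Map each weight to -1, 0, or +1; weights with |w| <= threshold become 0.

    The threshold is a hypothetical hyperparameter chosen for illustration.
    """
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


def storage_bits(num_weights: int, bits_per_weight: int) -> int:
    """Raw storage needed for the quantized weights, ignoring any scale factors."""
    return num_weights * bits_per_weight


# Illustrative weights, not taken from the paper.
weights = [0.8, -0.05, 0.3, -0.6, 0.02]

binary = binarize(weights)                   # two levels: 1 bit per weight
ternary = ternarize(weights, threshold=0.1)  # three levels: 2 bits per weight

print("binary :", binary, storage_bits(len(weights), 1), "bits")
print("ternary:", ternary, storage_bits(len(weights), 2), "bits")
```

In this sketch the ternary code keeps small weights at zero instead of forcing them to ±1, which is the extra flexibility the abstract refers to, at the price of twice the bits per weight.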

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning