An Overview of Datatype Quantization Techniques for Convolutional Neural Networks

Ali Athar

arXiv:1808.07530 · cs.NE · August 24, 2018

Open Access

TL;DR

This paper reviews various quantization techniques for CNNs that reduce hardware complexity and power consumption, enabling their deployment on low-power devices without significantly sacrificing performance.

Contribution

It provides a comprehensive overview and comparison of different quantization methods for CNNs, highlighting their advantages and limitations.

Findings

01. Quantization reduces hardware requirements and power consumption.

02. Certain techniques maintain CNN accuracy after quantization.

03. The paper discusses trade-offs between quantization levels and performance.

Abstract

Convolutional Neural Networks (CNNs) are becoming increasingly popular due to their superior performance in computer vision applications such as object detection and recognition. However, they demand complex, power-consuming hardware, which makes them unsuitable for implementation on low-power mobile and embedded devices. This paper describes and compares various techniques that aim to mitigate this problem. They do so primarily by quantizing the floating-point weights and activations to reduce the hardware requirements, and by adapting the training and inference algorithms to maintain the network's performance.
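
As a rough illustration of what quantizing floating-point weights can look like, the following is a minimal sketch of symmetric uniform quantization to signed 8-bit integers. It is not taken from the paper and is only one of many possible schemes; the function names (`quantize_symmetric`, `dequantize`) and the example weights are hypothetical and chosen purely for illustration.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers using one shared scale factor.

    Illustrative sketch only: a symmetric, per-tensor, round-to-nearest
    scheme. Real methods surveyed in the literature differ in many details.
    """
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to [-qmax, qmax].
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


# Hypothetical example weights (not from the paper).
weights = [0.4, -1.0, 0.2, 0.0]
q, scale = quantize_symmetric(weights, num_bits=8)
approx = dequantize(q, scale)
```

In this sketch each weight is stored as a small integer plus one shared `scale`, so the reconstruction error per weight is at most about half a quantization step (`scale / 2`). Lower bit widths shrink storage and arithmetic cost further but enlarge that step, which is the trade-off the abstract alludes to.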

Taxonomy

Topics: Advanced Neural Network Applications · Neural Networks and Applications · Advanced Image and Video Retrieval Techniques