Intra-layer Nonuniform Quantization for Deep Convolutional Neural Network

Fangxuan Sun; Jun Lin; Zhongfeng Wang

arXiv:1607.02720·cs.CV·August 9, 2016·5 cites

TL;DR

This paper introduces two nonuniform quantization schemes for deep CNNs that reduce the required memory storage by approximately 50% on VGG-16 and AlexNet while achieving almost the same or better classification accuracy, targeting low-complexity hardware and software implementations.

Contribution

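It proposes an equal distance nonuniform quantization (ENQ) scheme and a K-means clustering nonuniform quantization (KNQ) scheme that cut the required memory storage by approximately 50% while achieving almost the same or better classification accuracy than the state-of-the-art quantization method.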

Findings

01

Both schemes reduce the required memory storage for VGG-16 and AlexNet by approximately 50%.

02

KNQ provides a better tradeoff than ENQ when higher accuracy is required.

03

Both schemes achieve almost the same or better classification accuracy than the state-of-the-art quantization method while using less memory.

Abstract

Deep convolutional neural networks (DCNNs) have achieved remarkable performance on object detection and speech recognition in recent years. However, the excellent performance of a DCNN comes with high computational complexity and a large memory requirement. In this paper, an equal distance nonuniform quantization (ENQ) scheme and a K-means clustering nonuniform quantization (KNQ) scheme are proposed to reduce the required memory storage when low-complexity hardware or software implementations are considered. For VGG-16 and AlexNet, the proposed nonuniform quantization schemes reduce the required memory storage by approximately 50% while achieving almost the same or even better classification accuracy compared to the state-of-the-art quantization method. Compared to the ENQ scheme, the proposed KNQ scheme provides a better tradeoff when higher accuracy is required.
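
The abstract does not spell out how KNQ works, so the following is only a rough illustration of the general idea behind clustering-based nonuniform quantization: weights are grouped with one-dimensional k-means, each weight is stored as a short index, and a small codebook holds the cluster centers. Everything here is an assumption made for illustration, not the authors' procedure: the function names (`kmeans_1d`, `quantize`, `dequantize`, `storage_bits`), the even-spacing initialization, the iteration count, the 32-bit codebook entries, and the synthetic Gaussian weights are all hypothetical.

```python
import random

def kmeans_1d(values, k, iters=20):
    """Plain Lloyd's algorithm on scalars; returns (centroids, indices)."""
    lo, hi = min(values), max(values)
    # Assumption: start with centroids spread evenly over the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def assign():
        # Map every value to the index of its nearest centroid.
        return [min(range(k), key=lambda j: abs(v - centroids[j]))
                for v in values]

    for _ in range(iters):
        indices = assign()
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = [v for v, idx in zip(values, indices) if idx == j]
            if members:
                centroids[j] = sum(members) / len(members)
    # Final assignment so indices match the last centroid update.
    return centroids, assign()

def quantize(weights, bits):
    """Cluster weights into 2**bits levels; returns (codebook, indices)."""
    return kmeans_1d(weights, 2 ** bits)

def dequantize(codebook, indices):
    """Rebuild approximate weights by codebook lookup."""
    return [codebook[i] for i in indices]

def storage_bits(num_weights, bits, codebook_word_bits=32):
    """Index storage plus codebook storage, in bits (illustrative model)."""
    return num_weights * bits + (2 ** bits) * codebook_word_bits

# Synthetic stand-in for one layer's weights (not real network data).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(2000)]

codebook, indices = quantize(weights, bits=4)
restored = dequantize(codebook, indices)
mse = sum((w - r) ** 2 for w, r in zip(weights, restored)) / len(weights)

print("levels:", len(codebook))
print("mean squared error:", mse)
print("bits, 4-bit indices + codebook:", storage_bits(len(weights), 4))
print("bits, 32-bit weights:", len(weights) * 32)
```

Because the cluster centers follow the weight distribution, the quantization levels are spaced unevenly, which is what makes the scheme nonuniform. The sketch treats a single flat list of weights and does not model the intra-layer aspect named in the title.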

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Advanced Data Compression Techniques

Methods: Diffusion-Convolutional Neural Networks · 1x1 Convolution · Convolution · Local Response Normalization · Grouped Convolution · Dropout · Dense Connections · Max Pooling · Softmax