Fixed Point Quantization of Deep Convolutional Networks

Darryl D. Lin; Sachin S. Talathi; V. Sreekanth Annapureddy

arXiv:1511.06393 · cs.LG · June 3, 2016 · 608 cites

Open Access

TL;DR

This paper introduces an optimized fixed point quantization method for deep convolutional networks. Compared with equal bit-width settings, the optimized bit-width allocation reduces model size by more than 20% without accuracy loss on CIFAR-10, and the paper reports state-of-the-art fixed point performance on that benchmark.

Contribution

It proposes a quantizer design for fixed point DCNs and formulates an optimization problem that allocates fixed point bit-widths across the layers of the network.

Findings

01. Over 20% reduction in model size with no accuracy loss

02. Achieved a new fixed point error rate of 6.78% on CIFAR-10

03. Fine-tuning enhances fixed point model accuracy beyond floating point models
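
To make the model-size finding concrete, the arithmetic behind a per-layer bit-width allocation can be sketched as follows. This is an illustration only: the layer parameter counts and bit-widths are hypothetical values chosen for the example, not figures from the paper, and the helper names `model_size_bits` and `size_reduction` are assumptions introduced here.

```python
def model_size_bits(param_counts, bit_widths):
    """Total weight storage in bits: each layer's parameters times its bit-width."""
    if len(param_counts) != len(bit_widths):
        raise ValueError("need exactly one bit-width per layer")
    return sum(n * b for n, b in zip(param_counts, bit_widths))


def size_reduction(param_counts, equal_bits, per_layer_bits):
    """Fractional size saving of a per-layer allocation versus one shared bit-width."""
    baseline = model_size_bits(param_counts, [equal_bits] * len(param_counts))
    optimized = model_size_bits(param_counts, per_layer_bits)
    return 1.0 - optimized / baseline


# Hypothetical three-layer network (NOT taken from the paper).
param_counts = [1_000, 50_000, 200_000]
reduction = size_reduction(param_counts, equal_bits=8, per_layer_bits=[10, 8, 6])
print(f"size reduction vs. equal bit-width: {reduction:.1%}")
```

The sketch shows why unequal allocation can pay off: shaving bits from a layer that holds most of the parameters saves far more storage than is spent by giving a small layer extra precision.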

Abstract

In recent years increasingly complex architectures for deep convolution networks (DCNs) have been proposed to boost the performance on image recognition tasks. However, the gains in performance have come at a cost of substantial increase in computation and model storage resources. Fixed point implementation of DCNs has the potential to alleviate some of these complexities and facilitate potential deployment on embedded hardware. In this paper, we propose a quantizer design for fixed point implementation of DCNs. We formulate and solve an optimization problem to identify optimal fixed point bit-width allocation across DCN layers. Our experiments show that in comparison to equal bit-width settings, the fixed point DCNs with optimized bit width allocation offer >20% reduction in the model size without any loss in accuracy on CIFAR-10 benchmark. We also demonstrate that fine-tuning can…
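
As a rough illustration of what fixed point quantization of a layer's weights involves, the sketch below rounds values to a signed fixed point grid and saturates them at the representable range. It is a generic uniform quantizer, not the quantizer design proposed in the paper. The function names, the choice of `frac_bits`, the synthetic Gaussian weights, and the use of signal-to-quantization-noise ratio as a quality measure are all assumptions made for the example.

```python
import numpy as np


def quantize_fixed_point(x, bit_width, frac_bits):
    """Round x to a signed fixed point grid and saturate at the range limits."""
    step = 2.0 ** (-frac_bits)           # grid resolution
    q_min = -(2 ** (bit_width - 1))      # most negative integer code
    q_max = 2 ** (bit_width - 1) - 1     # most positive integer code
    codes = np.round(np.asarray(x, dtype=np.float64) / step)
    return np.clip(codes, q_min, q_max) * step


def sqnr_db(x, x_q):
    """Signal-to-quantization-noise ratio in decibels."""
    x = np.asarray(x, dtype=np.float64)
    noise = x - x_q
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum(noise ** 2))


# Synthetic stand-in for a layer's weights (assumption: zero-mean Gaussian).
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.5, size=10_000)

for bits in (4, 8, 12):
    # Assumption: two integer bits (including sign), the rest fractional.
    w_q = quantize_fixed_point(weights, bit_width=bits, frac_bits=bits - 2)
    print(bits, "bits ->", round(sqnr_db(weights, w_q), 1), "dB")
```

Widening the bit-width shrinks the step size and so raises the measured ratio. A bit-width allocation across layers trades that precision against storage.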

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging