Deep Convolutional Neural Network Inference with Floating-point Weights and Fixed-point Activations

Liangzhen Lai; Naveen Suda; Vikas Chandra

arXiv:1703.03073·cs.LG·March 10, 2017·85 cites

TL;DR

This paper proposes a hybrid floating-point and fixed-point representation scheme for CNN inference, demonstrating improved efficiency and reduced hardware power consumption on large-scale networks.

Contribution

It proposes representing CNN weights as floating-point numbers and activations as fixed-point numbers, which the authors show is more efficient than fixed-point weights at the same bit-width.

Findings

1. Reduces weight storage by up to 36%.
2. Decreases hardware multiplier power consumption by up to 50%.
3. Effective on large-scale CNNs like AlexNet and VGG-16.

Abstract

Deep convolutional neural network (CNN) inference requires a significant amount of memory and computation, which limits its deployment on embedded devices. To alleviate these problems to some extent, prior research uses low-precision fixed-point numbers to represent the CNN weights and activations. However, the minimum required data precision of fixed-point weights varies across different networks and also across different layers of the same network. In this work, we propose using floating-point numbers for representing the weights and fixed-point numbers for representing the activations. We show that using floating-point representation for weights is more efficient than fixed-point representation for the same bit-width and demonstrate it on popular large-scale CNNs such as AlexNet, SqueezeNet, GoogLeNet and VGG-16. We also show that such a representation scheme enables compact…
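
To make the abstract's comparison concrete, here is a minimal sketch of the two number formats at the same 8-bit width. It is not the authors' implementation: the function names, the 1-4-3 sign/exponent/mantissa split, the exponent bias of 15, and the 7 fractional bits are all assumptions chosen for illustration, and the float format is simplified (no infinities or NaNs, and values below the smallest exponent are rounded on that exponent's grid).

```python
import math


def quantize_fixed(x, total_bits, frac_bits):
    """Round x to a signed fixed-point value with `frac_bits` fractional bits.

    Values outside the representable range saturate.
    """
    scale = 1 << frac_bits
    qmax = (1 << (total_bits - 1)) - 1
    qmin = -(1 << (total_bits - 1))
    q = max(qmin, min(qmax, round(x * scale)))
    return q / scale


def quantize_minifloat(x, exp_bits, man_bits, bias):
    """Round x to a small float with 1 sign bit, `exp_bits` exponent bits
    and `man_bits` mantissa bits (hypothetical layout, see lead-in).
    """
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    # frexp returns mag = m * 2**ex with 0.5 <= m < 1, so floor(log2(mag)) = ex - 1.
    _, ex = math.frexp(mag)
    e_min = -bias
    e_max = (1 << exp_bits) - 1 - bias
    e = max(e_min, min(e_max, ex - 1))
    # Spacing between representable values at this exponent.
    step = 2.0 ** (e - man_bits)
    q = round(mag / step) * step
    # Saturate at the largest representable magnitude.
    max_val = (2.0 - 2.0 ** -man_bits) * 2.0 ** e_max
    return sign * min(q, max_val)


if __name__ == "__main__":
    # Both formats use 8 bits per weight.
    for w in (0.3, 0.02, 0.001):
        fx = quantize_fixed(w, total_bits=8, frac_bits=7)
        fl = quantize_minifloat(w, exp_bits=4, man_bits=3, bias=15)
        print(f"w={w}: fixed={fx}, float={fl}")
```

With these assumed parameters, a small weight such as 0.001 rounds to 0 in the fixed-point format but to 2^-10 in the floating-point format, while a larger weight such as 0.3 is represented slightly more accurately in fixed-point. Floating-point spends bits on dynamic range, which matters when weight magnitudes vary widely across layers.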

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Advanced Memory and Neural Computing

Methods: Residual Connection · Convolution · Average Pooling · Fire Module · Local Response Normalization · Auxiliary Classifier · Inception Module · Global Average Pooling · Grouped Convolution · 1x1 Convolution