Computation Error Analysis of Block Floating Point Arithmetic Oriented Convolution Neural Network Accelerator Design

Zhourui Song; Zhenyu Liu; Dongsheng Wang

arXiv:1709.07776 · cs.LG · November 27, 2017 · 6 cites

TL;DR

This paper investigates how block floating point (BFP) arithmetic affects CNN accuracy and efficiency. It shows that BFP with an 8-bit mantissa (sign bit included) loses less than 0.3% accuracy without retraining, and it provides a theoretical error bound to guide CNN accelerator design.

Contribution

The paper verifies how the BFP word width affects CNN performance without retraining and develops a theoretical upper bound on the noise-to-signal ratio (NSR) for BFP-based CNN accelerators.

Findings

01. 8-bit mantissa BFP causes less than 0.3% accuracy loss
02. Theoretical NSR upper bound guides BFP CNN design
03. BFP reduces hardware cost and data traffic

Abstract

Heavy computation and off-chip traffic impede the deployment of large-scale convolutional neural networks (CNNs) on embedded platforms. Because CNNs tolerate computation errors well, employing block floating point (BFP) arithmetic in CNN accelerators can reduce hardware cost and data traffic efficiently while maintaining classification accuracy. In this paper, we verify the effects of BFP word-width definitions on CNN performance without retraining. Several typical CNN models, including VGG16, ResNet-18, ResNet-50 and GoogLeNet, were tested. Experiments revealed that an 8-bit mantissa, including the sign bit, in the BFP representation induced less than 0.3% accuracy loss. In addition, we investigate the computational errors in theory and develop the noise-to-signal ratio (NSR) upper bound, which provides promising guidance for BFP…
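
To make the abstract's two central notions concrete, here is a minimal sketch of BFP quantization and an empirical NSR measurement. It is an illustration under stated assumptions, not the authors' implementation: the function names `bfp_quantize` and `nsr_db` are hypothetical, the shared exponent is taken from the largest magnitude in the block, and mantissas are rounded to nearest and saturated (the paper's exact rounding scheme and block partitioning are not given in this summary).

```python
import math
import random


def bfp_quantize(block, mantissa_bits=8):
    """Quantize a block of floats to a shared exponent (hypothetical helper).

    Assumption: mantissa_bits counts the sign bit, so each mantissa is a
    signed integer in [-(2**(bits-1) - 1), 2**(bits-1) - 1].
    Returns (dequantized values, shared exponent).
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0.0] * len(block), 0
    # frexp gives max_abs = m * 2**exp with 0.5 <= m < 1,
    # so every |x| in the block is below 2**exp.
    _, exp = math.frexp(max_abs)
    step = 2.0 ** (exp - (mantissa_bits - 1))  # value of one mantissa LSB
    limit = 2 ** (mantissa_bits - 1) - 1
    # Assumption: round to nearest, then saturate to the mantissa range.
    mantissas = [max(-limit, min(limit, round(x / step))) for x in block]
    return [m * step for m in mantissas], exp


def nsr_db(original, quantized):
    """Empirical noise-to-signal ratio in dB: 10*log10(sum(err^2)/sum(x^2))."""
    noise = sum((a - b) ** 2 for a, b in zip(original, quantized))
    signal = sum(a * a for a in original)
    if noise == 0.0:
        return float("-inf")
    return 10.0 * math.log10(noise / signal)


if __name__ == "__main__":
    random.seed(0)
    data = [random.gauss(0.0, 1.0) for _ in range(64)]
    for bits in (4, 6, 8):
        approx, shared_exp = bfp_quantize(data, bits)
        print(bits, shared_exp, round(nsr_db(data, approx), 2))
```

Because one exponent serves the whole block, small values in a block dominated by a large outlier lose relative precision; this is the error source that an NSR bound has to account for. The measured NSR here is an empirical figure for one random block, not the theoretical upper bound derived in the paper.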

Taxonomy

Topics: Advanced Memory and Neural Computing · Advanced Neural Network Applications · Neural Networks and Applications

Methods: 1x1 Convolution · Average Pooling · Local Response Normalization · Auxiliary Classifier · Inception Module · Dropout · Dense Connections · Max Pooling · Softmax