Accuracy to Throughput Trade-offs for Reduced Precision Neural Networks on Reconfigurable Logic

Jiang Su; Nicholas J. Fraser; Giulio Gambardella; Michaela Blott; Gianluca Durelli; David B. Thomas; Philip Leong; Peter Y. K. Cheung

arXiv:1807.10577 · cs.CV · July 30, 2018

Open Access

TL;DR

This paper explores the trade-off between accuracy and throughput in reduced-precision neural networks on reconfigurable logic. It proposes a quantization training strategy and analyzes hardware efficiency across different precisions and datasets.

Contribution

It introduces a quantization training method for reduced precision NNs and quantitatively links data representation with hardware efficiency in neural network inference.
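To make the link between data representation and memory footprint concrete, here is a minimal sketch. It is not the paper's training strategy: it applies a generic uniform quantizer to already-trained weights, and the names `quantize_uniform` and `footprint_bytes` are hypothetical helpers introduced only for illustration.

```python
def quantize_uniform(weights, bits):
    """Round each weight to the nearest of 2**bits evenly spaced levels.

    Assumption: a symmetric range [-max_abs, max_abs] taken from the
    weights themselves. This is a generic scheme, not the paper's method.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    steps = 2 ** bits - 1          # gaps between the lowest and highest level
    step = 2 * max_abs / steps     # distance between adjacent levels
    return [-max_abs + round((w + max_abs) / step) * step for w in weights]


def footprint_bytes(num_params, bits):
    """Storage needed for num_params parameters at the given bit width."""
    return (num_params * bits + 7) // 8   # round up to whole bytes


if __name__ == "__main__":
    weights = [-1.0, -0.2, 0.3, 1.0]
    for bits in (1, 2, 4):
        print(bits, quantize_uniform(weights, bits))
    for bits in (1, 2, 4, 32):
        print(bits, footprint_bytes(1_000_000, bits), "bytes per million parameters")
```

With 1 bit the quantizer collapses every weight to one of two values, so storage shrinks by 32x relative to 32-bit parameters. The paper's question is how much accuracy and hardware throughput change along that scale.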

Findings

1. 2-bit and 4-bit fixed point parameters outperform 1-bit in hardware efficiency on small datasets.

2. 4-bit precision offers the best trade-off for large-scale tasks like ImageNet.

3. 32-bit floating point is more hardware efficient than 1-bit parameters for MNIST accuracy.

Abstract

Modern CNNs are typically built on floating-point linear algebra implementations. Recently, reduced-precision NNs have been gaining popularity because they require significantly less memory and fewer computational resources than floating point. This is particularly important in power-constrained compute environments. However, in many cases a reduction in precision comes at a small cost to the accuracy of the resulting network. In this work, we investigate the accuracy-throughput trade-off for various parameter precisions applied to different types of NN models. We first propose a quantization training strategy that allows reduced-precision NN inference with a lower memory footprint and competitive model accuracy. Then, we quantitatively formulate the relationship between data representation and hardware efficiency. Finally, our experiments provide insightful observations. For example,…

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification