Accuracy to Throughput Trade-offs for Reduced Precision Neural Networks on Reconfigurable Logic
Jiang Su, Nicholas J. Fraser, Giulio Gambardella, Michaela Blott, Gianluca Durelli, David B. Thomas, Philip Leong, Peter Y. K. Cheung

TL;DR
This paper explores the accuracy and throughput trade-offs in reduced precision neural networks on reconfigurable logic, proposing a quantization training strategy and analyzing hardware efficiency across different precisions and datasets.
Contribution
It introduces a quantization training method for reduced precision NNs and quantitatively links data representation with hardware efficiency in neural network inference.
Findings
2-bit and 4-bit fixed point parameters outperform 1-bit in hardware efficiency on small datasets.
4-bit precision offers the best trade-off for large-scale tasks like ImageNet.
In one reported test, 32-bit floating point is more hardware efficient than 1-bit parameters for reaching a target MNIST accuracy.
Abstract
Modern CNNs are typically built on floating-point linear algebra implementations. Recently, reduced precision NNs have been gaining popularity because they require significantly less memory and fewer computational resources than floating point. This is particularly important in power-constrained compute environments. However, in many cases a reduction in precision comes at a small cost to the accuracy of the resultant network. In this work, we investigate the accuracy-throughput trade-off for various parameter precisions applied to different types of NN models. We first propose a quantization training strategy that allows reduced precision NN inference with a lower memory footprint and competitive model accuracy. Then, we quantitatively formulate the relationship between data representation and hardware efficiency. Finally, our experiments provide insightful observations. For example,…
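To make the memory side of the trade-off concrete, here is a minimal Python sketch of reducing parameters to a few bits. It is an illustration only and not the quantization training strategy proposed in the paper: it assumes plain symmetric uniform quantization applied after training, and the helper names `quantize_uniform`, `binarize`, and `footprint_bits` are hypothetical.

```python
def quantize_uniform(weights, bits):
    """Symmetric uniform quantization to signed integer codes (assumed scheme)."""
    if bits < 2:
        raise ValueError("use binarize() for 1-bit parameters")
    # Largest magnitude sets the range; fall back to 1.0 for all-zero input.
    max_abs = max(abs(w) for w in weights) or 1.0
    levels = 2 ** (bits - 1) - 1      # e.g. 7 positive levels for 4 bits
    scale = max_abs / levels          # real value represented by one code step
    codes = [max(-levels, min(levels, round(w / scale))) for w in weights]
    dequantized = [c * scale for c in codes]
    return dequantized, codes, scale


def binarize(weights):
    """1-bit case: keep only the sign of each weight."""
    return [1.0 if w >= 0 else -1.0 for w in weights]


def footprint_bits(num_params, bits):
    """Parameter storage in bits, ignoring scale factors and packing overhead."""
    return num_params * bits


if __name__ == "__main__":
    w = [0.9, -1.0, 0.1]
    for bits in (2, 4):
        approx, codes, scale = quantize_uniform(w, bits)
        print(bits, codes, approx)
    print(binarize(w))
    print(footprint_bits(1000, 32) // footprint_bits(1000, 4))
```

Under these assumptions, storing parameters in 4 bits instead of 32 cuts parameter memory by a factor of 8, while the rounding error of each weight is bounded by half a code step. How that error affects accuracy, and how the saved memory translates into throughput on reconfigurable logic, is what the paper measures.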
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
