Training Deep Neural Networks with 8-bit Floating Point Numbers
Naigang Wang, Jungwook Choi, Daniel Brand, Chia-Yu Chen, Kailash Gopalakrishnan

TL;DR
This paper demonstrates the first successful training of deep neural networks with 8-bit floating point numbers for data and computation, combined with 16-bit additions, maintaining accuracy across a range of models and datasets while improving hardware efficiency.
Contribution
It introduces techniques for training DNNs with 8-bit floating point data and 16-bit additions without losing accuracy. Training below 16 bits had previously been difficult because back-propagation requires faithful gradient computations.
Findings
Achieved accurate training of DNNs with 8-bit floating point numbers.
Reduced the precision of additions (partial product accumulation and weight updates) from 32 bits to 16 bits without loss of accuracy.
Potential for 2-4x hardware throughput improvements.
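To see why 16-bit additions are hard to get right, the sketch below sums many small values while rounding the running total to 16 bits after every addition, then repeats the sum in chunks. This is a rough illustration, not the paper's implementation. It assumes IEEE binary16 (via Python's `struct` format `e`) as a stand-in for the 16-bit format, which may differ from the format used in the paper. The helper names and the chunk size of 64 are hypothetical. Splitting a long sum into chunks is the idea behind the chunk-based accumulation the paper proposes.

```python
import struct

def fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE binary16 value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def naive_sum_fp16(values):
    """Accumulate left to right, rounding the running sum to 16 bits each step."""
    acc = 0.0
    for v in values:
        acc = fp16(acc + fp16(v))
    return acc

def chunked_sum_fp16(values, chunk_size=64):
    """Sum fixed-size chunks in 16 bits, then sum the chunk totals in 16 bits.

    chunk_size=64 is an arbitrary choice for this illustration.
    """
    partials = [
        naive_sum_fp16(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    return naive_sum_fp16(partials)

values = [0.01] * 16384          # true sum is about 163.84
exact = sum(values)              # float64 reference
naive = naive_sum_fp16(values)   # stalls at 32.0: adding 0.01 is below half the 16-bit spacing there
chunked = chunked_sum_fp16(values)

print(f"float64 reference: {exact:.2f}")
print(f"naive 16-bit sum:  {naive:.2f}")
print(f"chunked 16-bit sum: {chunked:.2f}")
```

The naive sum stops growing once each small addend falls below half the spacing between representable 16-bit values near the running total. Chunking keeps the addends and the running total closer in magnitude, so far less of each addition is rounded away.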
Abstract
The state-of-the-art hardware platforms for training Deep Neural Networks (DNNs) are moving from traditional single precision (32-bit) computations towards 16 bits of precision -- in large part due to the high energy efficiency and smaller bit storage associated with using reduced-precision representations. However, unlike inference, training with numbers represented with less than 16 bits has been challenging due to the need to maintain fidelity of the gradient computations during back-propagation. Here we demonstrate, for the first time, the successful training of DNNs using 8-bit floating point numbers while fully maintaining the accuracy on a spectrum of Deep Learning models and datasets. In addition to reducing the data and computation precision to 8 bits, we also successfully reduce the arithmetic precision for additions (used in partial product accumulation and weight updates)…
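Weight updates raise a similar problem: a small update added to a much larger weight can be rounded away entirely at 16 bits. Stochastic rounding, which the paper lists among its key ideas, is one way to preserve such updates on average. The following is a minimal sketch under stated assumptions, not the authors' code. It again uses IEEE binary16 as a stand-in for the 16-bit format. The function names, the weight value, and the update size are hypothetical, and inputs are assumed to lie well inside the binary16 range.

```python
import math
import random
import struct

def round_nearest_fp16(x: float) -> float:
    """Round to the nearest IEEE binary16 value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def round_stochastic_fp16(x: float, rng: random.Random) -> float:
    """Round x to one of its two binary16 neighbours.

    The upper neighbour is chosen with probability equal to the fractional
    distance of x above the lower neighbour, so the result equals x in
    expectation. Overflow handling is omitted in this sketch.
    """
    if x == 0.0:
        return 0.0
    _, e = math.frexp(abs(x))            # abs(x) = m * 2**e with 0.5 <= m < 1
    ulp = 2.0 ** max(e - 11, -24)        # binary16 spacing near x (11 significant bits)
    lo = math.floor(x / ulp) * ulp       # neighbour at or below x
    frac = (x - lo) / ulp                # in [0, 1)
    return lo + ulp if rng.random() < frac else lo

rng = random.Random(0)                   # fixed seed so the run is repeatable
w_nearest = w_stochastic = 1.0
update = 1e-4                            # smaller than half the binary16 spacing at 1.0

for _ in range(1000):
    w_nearest = round_nearest_fp16(w_nearest + update)
    w_stochastic = round_stochastic_fp16(w_stochastic + update, rng)

print(w_nearest)      # never leaves 1.0: every update is rounded away
print(w_stochastic)   # close to 1.1, the value an exact accumulation would give
```

Round-to-nearest discards every update here because each one is less than half the spacing between representable values near 1.0. Stochastic rounding keeps a fraction of the updates, chosen so that the accumulated weight is correct in expectation.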
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Advanced Neural Network Applications
