Mixed Precision Training With 8-bit Floating Point
Naveen Mellempudi, Sudarshan Srinivasan, Dipankar Das, Bharat Kaul

TL;DR
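This paper proposes a method for training deep neural networks with an 8-bit floating point representation for weights, activations, errors, and gradients, while also reducing the master copy of the weights from 32-bit to 16-bit. It reports state-of-the-art accuracy across multiple datasets and models.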
Contribution
The paper presents an approach to 8-bit floating point training that combines loss scaling with stochastic rounding, reaching state-of-the-art accuracy at reduced precision.
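The two techniques named above can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not give the bit layout or the scaling schedule, so the 2-bit mantissa, the 2**-14 underflow threshold, the fixed loss scale of 1024, and the function names are all assumptions chosen for illustration. Exponent-range clamping is omitted.

```python
import math
import random


def stochastic_round(x, mantissa_bits=2, rng=random):
    """Round x to a float with `mantissa_bits` explicit mantissa bits.

    The lower or upper representable neighbour is chosen with probability
    proportional to how close x is to it, so the result is unbiased on
    average. mantissa_bits=2 is an assumed layout, not taken from the source.
    """
    if x == 0.0 or not math.isfinite(x):
        return x
    _, exponent = math.frexp(x)  # x = m * 2**exponent with 0.5 <= |m| < 1
    # Spacing between neighbouring representable values around x.
    step = math.ldexp(1.0, exponent - 1 - mantissa_bits)
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower  # distance above the lower neighbour, in [0, 1)
    return (lower + (1 if rng.random() < frac else 0)) * step


def scaled_gradient(grad, loss_scale, min_normal=2.0 ** -14):
    """Model loss scaling for one gradient value.

    Scaling the loss by `loss_scale` scales the gradient by the same factor
    (chain rule). Values below `min_normal` (an assumed underflow threshold)
    are flushed to zero, then the survivor is unscaled before the update.
    """
    scaled = grad * loss_scale
    if abs(scaled) < min_normal:
        scaled = 0.0
    return scaled / loss_scale


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [stochastic_round(1.1, 2, rng) for _ in range(10000)]
    print("mean of rounded samples:", sum(samples) / len(samples))
    print("no scaling:", scaled_gradient(1e-6, 1.0))
    print("scale 1024:", scaled_gradient(1e-6, 1024.0))
```

With a 2-bit mantissa, 1.1 falls between the representable values 1.0 and 1.25, so round-to-nearest would always return 1.0, while stochastic rounding returns 1.25 often enough for the average to stay near 1.1. In the loss-scaling half, a gradient of 1e-6 is flushed to zero without scaling but survives when the loss is scaled by 1024.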
Findings
Achieved state-of-the-art accuracy on ImageNet and WMT16 datasets.
Demonstrated effective training of various models like ResNet and Transformer at 8-bit precision.
Reported slightly higher validation accuracy than the full-precision baseline.
Abstract
Reduced precision computation for deep neural networks is one of the key areas addressing the widening compute gap driven by an exponential growth in model size. In recent years, deep learning training has largely migrated to 16-bit precision, with significant gains in performance and energy efficiency. However, attempts to train DNNs at 8-bit precision have met with significant challenges because of the higher precision and dynamic range requirements of back-propagation. In this paper, we propose a method to train deep neural networks using 8-bit floating point representation for weights, activations, errors, and gradients. In addition to reducing compute precision, we also reduced the precision requirements for the master copy of weights from 32-bit to 16-bit. We demonstrate state-of-the-art accuracy across multiple data sets (ImageNet-1K, WMT16) and a broader set of workloads…
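To illustrate why a higher-precision master copy of the weights is kept alongside the 8-bit compute copy, the sketch below applies ten small SGD updates to a single weight. It is a toy model under stated assumptions, not the paper's training loop: the 2-bit mantissa for the 8-bit format, plain SGD, the constant gradient, and the helper names `to_fp8`, `to_fp16`, and `train` are all hypothetical. Exponent-range clamping is omitted.

```python
import math
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp8(x, mantissa_bits=2):
    """Round to nearest with `mantissa_bits` mantissa bits (assumed layout)."""
    if x == 0.0:
        return 0.0
    _, exponent = math.frexp(x)  # x = m * 2**exponent with 0.5 <= |m| < 1
    step = math.ldexp(1.0, exponent - 1 - mantissa_bits)
    return round(x / step) * step


def train(steps, lr, grad, use_master):
    """Apply `steps` SGD updates to one weight that starts at 1.0.

    With use_master=True the running weight is stored in 16-bit and only the
    compute copy is rounded to 8-bit. With use_master=False the running
    weight itself is stored in 8-bit. Returns (stored weight, 8-bit copy).
    """
    weight = 1.0
    for _ in range(steps):
        updated = weight - lr * grad
        weight = to_fp16(updated) if use_master else to_fp8(updated)
    return weight, to_fp8(weight)


if __name__ == "__main__":
    print("8-bit storage only:", train(10, 0.01, 1.0, use_master=False))
    print("16-bit master copy:", train(10, 0.01, 1.0, use_master=True))
```

In this toy setting each update of 0.01 is smaller than half the 8-bit spacing near 1.0, so the 8-bit-only weight never moves. The 16-bit master accumulates the updates, and the 8-bit compute copy eventually steps down to the next representable value.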
Taxonomy
Topics: Robotic Mechanisms and Dynamics · Astronomical Observations and Instrumentation · Image and Object Detection Techniques
