FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding
Sai Qian Zhang, Bradley McDanel, H.T. Kung

TL;DR
FAST introduces a variable precision block floating point (BFP) system with stochastic rounding for DNN training, shortening training time and reducing hardware resource usage while maintaining accuracy.
Contribution
The paper presents FAST (Fast First, Accurate Second Training), a system that supports variable precision BFP for DNN training, so that precision can be increased incrementally during training and overall training time is reduced.
Findings
Achieves 2-6× speedup in training time.
Maintains comparable validation accuracy to prior methods.
Reduces hardware resource usage during training.
Abstract
Block Floating Point (BFP) can efficiently support quantization for Deep Neural Network (DNN) training by providing a wide dynamic range via a shared exponent across a group of values. In this paper, we propose a Fast First, Accurate Second Training (FAST) system for DNNs, where the weights, activations, and gradients are represented in BFP. FAST supports matrix multiplication with variable precision BFP input operands, enabling incremental increases in DNN precision throughout training. By increasing the BFP precision across both training iterations and DNN layers, FAST can greatly shorten the training time while reducing overall hardware resource usage. Our FAST Multiplier-Accumulator (fMAC) supports dot product computations under multiple BFP precisions. We validate our FAST system on multiple DNNs with different datasets, demonstrating a 2-6× speedup in training on a…
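The abstract's core ideas (a shared exponent per block, stochastic rounding of mantissas, and dot products between operands of different precisions) can be illustrated with a minimal pure-Python sketch. This is not the paper's implementation or its fMAC design: the function names `bfp_quantize`, `bfp_dequantize`, and `bfp_dot`, the choice of shared exponent from the block's largest magnitude, and the clipping rule are all assumptions made for illustration.

```python
import math
import random


def bfp_quantize(block, mantissa_bits, rng=None):
    """Quantize a block of floats to (shared_exp, integer mantissas).

    Hypothetical helper: the shared exponent is taken from the largest
    magnitude in the block, and mantissas use stochastic rounding.
    """
    rng = rng or random.Random()
    max_abs = max(abs(v) for v in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp gives max_abs = f * 2**shared_exp with 0.5 <= f < 1.
    _, shared_exp = math.frexp(max_abs)
    # Spacing between representable values in this block.
    step = math.ldexp(1.0, shared_exp - mantissa_bits)
    limit = (1 << mantissa_bits) - 1
    mantissas = []
    for v in block:
        scaled = v / step
        low = math.floor(scaled)
        # Round up with probability equal to the fractional part,
        # so the rounding is unbiased in expectation (before clipping).
        q = low + (1 if rng.random() < scaled - low else 0)
        mantissas.append(max(-limit, min(limit, q)))
    return shared_exp, mantissas


def bfp_dequantize(shared_exp, mantissas, mantissa_bits):
    """Map integer mantissas back to floats using the shared exponent."""
    step = math.ldexp(1.0, shared_exp - mantissa_bits)
    return [m * step for m in mantissas]


def bfp_dot(exp_a, man_a, bits_a, exp_b, man_b, bits_b):
    """Dot product of two BFP blocks that may use different precisions.

    The mantissa products are accumulated as integers and scaled once.
    """
    acc = sum(x * y for x, y in zip(man_a, man_b))
    return math.ldexp(float(acc), (exp_a - bits_a) + (exp_b - bits_b))
```

For example, quantizing `[0.5, -0.25, 0.125, 0.75]` with 4 mantissa bits gives a shared exponent of 0 and mantissas `[8, -4, 2, 12]`, since every value is an exact multiple of the step 2^-4 and no rounding decision is needed. Raising `mantissa_bits` shrinks the step, which is the sense in which precision can be increased as training proceeds.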
Taxonomy
Topics: Neural Networks and Applications · Numerical Methods and Algorithms · Model Reduction and Neural Networks
