Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks
Urs Köster, Tristan J. Webb, Xin Wang, Marcel Nassar, Arjun K. Bansal, William H. Constable, Oğuz H. Elibol, Scott Gray, Stewart Hall, Luke Hornof, Amir Khosrowshahi, Carey Kloss, Ruby J. Pai, Naveen Rao

TL;DR
Flexpoint is a new adaptive numerical format designed to replace 32-bit floating point in deep learning, enabling efficient training and inference with minimal accuracy loss across various neural network architectures.
Contribution
The paper introduces Flexpoint, a novel dynamic shared-exponent data format that supports training of deep neural networks without modifications to the network, closely matching 32-bit floating point results in a 16-bit format.
Findings
Flexpoint closely matches 32-bit floating point in training accuracy.
Flexpoint requires no hyperparameter tuning.
Validated on AlexNet, ResNet, and GAN models.
Abstract
Deep neural networks are commonly developed and trained in 32-bit floating point format. Significant gains in performance and energy efficiency could be realized by training and inference in numerical formats optimized for deep learning. Despite advances in limited precision inference in recent years, training of neural networks in low bit-width remains a challenging problem. Here we present the Flexpoint data format, aiming at a complete replacement of 32-bit floating point format training and inference, designed to support modern deep network topologies without modifications. Flexpoint tensors have a shared exponent that is dynamically adjusted to minimize overflows and maximize available dynamic range. We validate Flexpoint by training AlexNet, a deep residual network and a generative adversarial network, using a simulator implemented with the neon deep learning framework. We…
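The shared-exponent idea in the abstract can be illustrated with a minimal sketch. It assumes 16-bit signed integer mantissas and picks the exponent from the largest magnitude in the current tensor; the abstract does not say how the exponent is chosen, so that rule is a simplification made for this example, not the paper's exponent management scheme. The names `choose_exponent`, `encode`, and `decode` are hypothetical.

```python
import math

# Assumption: 16-bit signed mantissas, following the "16-bit format" in the summary.
MANTISSA_BITS = 16
MAX_INT = 2 ** (MANTISSA_BITS - 1) - 1  # 32767


def choose_exponent(values):
    """Smallest shared exponent e such that round(max|v| / 2**e) fits in the mantissa."""
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return 0
    # math.frexp returns (m, p) with max_abs == m * 2**p and 0.5 <= m < 1.
    _, p = math.frexp(max_abs)
    e = p - (MANTISSA_BITS - 1)
    # Rounding can push the largest value one step past MAX_INT; widen the range if so.
    if round(max_abs / 2.0 ** e) > MAX_INT:
        e += 1
    return e


def encode(values):
    """Quantize floats to integer mantissas that all share one exponent."""
    e = choose_exponent(values)
    scale = 2.0 ** e
    mantissas = [max(-MAX_INT - 1, min(MAX_INT, round(v / scale))) for v in values]
    return mantissas, e


def decode(mantissas, e):
    """Recover approximate float values from mantissas and the shared exponent."""
    scale = 2.0 ** e
    return [m * scale for m in mantissas]


mantissas, exponent = encode([0.5, -0.25, 0.001])
approx = decode(mantissas, exponent)
```

Because every element shares one exponent, the largest value sets the resolution for the whole tensor: small elements such as 0.001 above are stored with an absolute error of up to half of `2**exponent`. This is why the choice of exponent trades overflow risk against usable dynamic range.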
Taxonomy
Topics: Model Reduction and Neural Networks · Tensor decomposition and applications · Parallel Computing and Optimization Techniques
Methods: 1x1 Convolution · Convolution · Local Response Normalization · Grouped Convolution · Dropout · Dense Connections · Max Pooling · Softmax
