AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference

Thierry Tambe; En-Yu Yang; Zishen Wan; Yuntian Deng; Vijay Janapa Reddi; Alexander Rush; David Brooks; Gu-Yeon Wei

arXiv:1909.13271·cs.LG·February 12, 2020·23 cites

TL;DR

AdaptivFloat is a floating-point-inspired data type for deep learning inference that adjusts its dynamic range per layer, improving accuracy at low bit widths and allowing an efficient hardware implementation.

Contribution

We introduce AdaptivFloat, a floating-point format with a per-layer adjustable dynamic range that improves low-precision neural network inference accuracy and hardware efficiency relative to existing quantization methods.

Findings

01 Outperforms block floating-point and other encodings at ≤8-bit precision.

02 Surpasses FP32 baseline in BLEU score and WER at low bit widths.

03 Hardware implementation shows reduced energy and area consumption.

Abstract

Conventional hardware-friendly quantization methods, such as fixed-point or integer, tend to perform poorly at very low word sizes because their shrinking dynamic ranges cannot adequately capture the wide data distributions commonly seen in sequence transduction models. We present AdaptivFloat, a floating-point inspired number representation format for deep learning that dynamically maximizes and optimally clips its available dynamic range, at a layer granularity, in order to create a faithful encoding of neural network parameters. AdaptivFloat consistently produces higher inference accuracies than block floating-point, uniform, IEEE-like float, or posit encodings at very low precision ($\leq$ 8-bit) across a diverse set of state-of-the-art neural network topologies. Notably, AdaptivFloat surpasses baseline FP32 performance by up to +0.3 in BLEU score and -0.75 in word error…
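The idea of shifting and clipping the dynamic range per layer can be illustrated with a minimal Python sketch. This is not the authors' reference implementation: the function name `adaptivfloat_quantize`, the default split of 3 exponent bits out of 8, the reservation of one encoding for zero, and the rule for rounding very small values are all assumptions made for illustration. The sketch picks one exponent bias per tensor so that the top of the exponent window covers the largest magnitude, clips to the largest representable value, and rounds the mantissa.

```python
import math

def adaptivfloat_quantize(values, n_bits=8, n_exp=3):
    """Quantize a layer's values using one shared exponent bias (illustrative sketch)."""
    n_mant = n_bits - 1 - n_exp          # one bit is used for the sign
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0.0 for _ in values], 0

    # floor(log2(max_abs)), computed exactly via frexp (max_abs = m * 2**e, 0.5 <= m < 1)
    exp_max = math.frexp(max_abs)[1] - 1
    # Shift the exponent window so its top binade holds the largest magnitude.
    exp_bias = exp_max - (2 ** n_exp - 1)

    step = 2.0 ** -n_mant
    # Assumption: the lowest encoding is reserved for zero, so the smallest
    # nonzero magnitude has the minimum exponent and a mantissa of one step.
    value_min = 2.0 ** exp_bias * (1 + step)
    value_max = 2.0 ** exp_max * (2 - step)

    quantized = []
    for v in values:
        a = abs(v)
        if a < value_min:
            # Assumption: round tiny values to zero or to value_min, whichever is closer.
            q = value_min if a >= value_min / 2 else 0.0
        else:
            a = min(a, value_max)                    # clip to the representable range
            e = math.frexp(a)[1] - 1                 # exponent of this value
            scale = 2.0 ** (e - n_mant)              # spacing of mantissa steps in this binade
            q = min(round(a / scale) * scale, value_max)
        quantized.append(math.copysign(q, v))
    return quantized, exp_bias

weights = [0.75, -0.1, 0.003, 1.9, -3.2]
q, bias = adaptivfloat_quantize(weights)
print(bias, q)
```

Because the bias is derived from each tensor's own maximum, a layer with small weights and a layer with large weights each get a window of the same width placed where their values actually lie, which is the property the abstract contrasts with fixed-point and integer formats.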

Taxonomy

Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Numerical Methods and Algorithms