Vectorization of Multibyte Floating Point Data Formats

Andrew Anderson; David Gregg

arXiv:1601.07789 · cs.MS · January 31, 2017

TL;DR

This paper introduces a flexible reduced-precision floating point scheme that can be efficiently accelerated using existing hardware vector units on general-purpose processors, reducing storage and transfer costs.

Contribution

It presents a continuum of reduced-precision floating point formats lying between the IEEE-754 types, and demonstrates that compiler support allows these formats to be accelerated with the existing vector hardware of general-purpose processors (GPPs).

Findings

01 Supports lower precision floating point with low overhead

02 Enables reduced storage and transfer volume

03 Achieves acceleration using existing vector hardware

Abstract

We propose a scheme for reduced-precision representation of floating point data on a continuum between IEEE-754 floating point types. Our scheme enables the use of lower precision formats for a reduction in storage space requirements and data transfer volume. We describe how our scheme can be accelerated using existing hardware vector units on a general-purpose processor (GPP). Exploiting native vector hardware allows us to support reduced precision floating point with low overhead. We demonstrate that supporting reduced precision in the compiler as opposed to using a library approach can yield a low overhead solution for GPPs.
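
To make the idea of a format "between IEEE-754 types" concrete, here is a minimal sketch of one point on such a continuum: a 3-byte format obtained by keeping only the most significant 24 bits of an IEEE-754 binary32 value. The bit layout, the use of truncation rather than rounding, and the names `pack_f24` and `unpack_f24` are assumptions made for illustration; the abstract does not specify them. The sketch is also scalar, so it shows only the storage format, not the vectorized conversion the paper describes.

```python
import struct

def pack_f24(values):
    """Store each value as the top 3 bytes of its binary32 encoding.

    Assumption: the low 8 mantissa bits are simply truncated (no rounding).
    """
    out = bytearray()
    for v in values:
        # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
        bits = struct.unpack("<I", struct.pack("<f", v))[0]
        # Keep sign, exponent, and the upper 15 mantissa bits.
        out += (bits >> 8).to_bytes(3, "little")
    return bytes(out)

def unpack_f24(buf):
    """Expand 3-byte values back to float32 by zero-filling the low byte."""
    vals = []
    for i in range(0, len(buf), 3):
        bits = int.from_bytes(buf[i:i + 3], "little") << 8
        vals.append(struct.unpack("<f", struct.pack("<I", bits))[0])
    return vals

packed = pack_f24([1.0, -2.5, 0.1])
restored = unpack_f24(packed)
```

Under these assumptions, three values occupy 9 bytes instead of 12, a 25% reduction in storage and transfer volume. Values whose low 8 mantissa bits are already zero (such as 1.0 and -2.5) survive the round trip exactly, while others (such as 0.1) lose up to 8 bits of precision.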
