MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V cores

Luca Bertaccini; Gianna Paulin; Tim Fischer; Stefan Mach; Luca Benini

arXiv:2207.03192·cs.AR·October 28, 2024

TL;DR

This paper introduces MiniFloat-NN, an ISA extension for RISC-V supporting low-precision floating-point formats, and an open hardware unit ExSdotp, to enhance energy-efficient neural network training.

Contribution

The paper presents an ISA extension and a hardware module for low-precision NN training, with support for multiple FP formats and efficient sum-of-dot-product operations.

Findings

01

Achieves up to 575 GFLOPS/W when computing FP8-to-FP16 matrix multiplications (GEMMs).

02

The fused ExSdotp unit saves around 30% of the area and critical path compared with a cascade of two expanding fused multiply-add units.

03

Provides an open-source hardware platform for scalable low-precision NN training.

Abstract

Low-precision formats have recently driven major breakthroughs in neural network (NN) training and inference by reducing the memory footprint of the NN models and improving the energy efficiency of the underlying hardware architectures. Narrow integer data types have been vastly investigated for NN inference and have successfully been pushed to the extreme of ternary and binary representations. In contrast, most training-oriented platforms use at least 16-bit floating-point (FP) formats. Lower-precision data types such as 8-bit FP formats and mixed-precision techniques have only recently been explored in hardware implementations. We present MiniFloat-NN, a RISC-V instruction set architecture extension for low-precision NN training, providing support for two 8-bit and two 16-bit FP formats and expanding operations. The extension includes sum-of-dot-product instructions that accumulate…
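
To make the idea of an expanding sum-of-dot-product concrete, here is a minimal sketch in plain Python. It assumes the operation has the form a*b + c*d + e, with narrow inputs (FP16 here) and a wider accumulator and result (FP32 here). The function names, the choice of FP16/FP32, and the operand layout are illustrative assumptions, not the MiniFloat-NN instruction definitions. The sketch contrasts a fused evaluation, which rounds once, with a cascade of two expanding multiply-adds, which rounds twice.

```python
import struct
from fractions import Fraction


def to_fp16(x: float) -> float:
    """Round a Python float to IEEE binary16 and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x) -> float:
    """Round a value to IEEE binary32 (through a Python double)."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]


def expanding_sdotp(a, b, c, d, e):
    """Hypothetical fused a*b + c*d + e: FP16 inputs, FP32 accumulator.

    The sum is formed exactly with Fraction and rounded to FP32 once.
    The final conversion passes through a double, which is adequate
    for the small illustrative values used below.
    """
    a, b, c, d = (to_fp16(v) for v in (a, b, c, d))
    exact = Fraction(a) * Fraction(b) + Fraction(c) * Fraction(d) + Fraction(to_fp32(e))
    return to_fp32(exact)


def cascaded_expanding_fma(a, b, c, d, e):
    """Same computation as two expanding multiply-adds, rounding after each."""
    a, b, c, d = (to_fp16(v) for v in (a, b, c, d))
    t = to_fp32(Fraction(a) * Fraction(b) + Fraction(to_fp32(e)))
    return to_fp32(Fraction(c) * Fraction(d) + Fraction(t))


# Each product is 2**-24, half an FP32 ulp at 1.0. Added one at a time,
# each is lost to rounding; added together, they move the result by one ulp.
x = 2.0 ** -12
fused = expanding_sdotp(x, x, x, x, 1.0)
cascade = cascaded_expanding_fma(x, x, x, x, 1.0)
print("fused  :", fused)
print("cascade:", cascade)
print("fused differs from cascade:", fused != cascade)
```

In this example the cascade discards both small products because each intermediate sum is rounded back to FP32, while the single rounding of the fused form keeps their combined contribution.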
