Enabling Mixed-Precision Quantized Neural Networks in Extreme-Edge Devices

Nazareno Bruschi; Angelo Garofalo; Francesco Conti; Giuseppe Tagliavini; Davide Rossi

arXiv:2007.07759 · cs.AR · July 16, 2020

TL;DR

This paper extends the PULP-NN library with kernels for mixed-precision quantized neural networks on ultra-low-power RISC-V microcontrollers, reaching 16 MACs/cycle on 8 cores and running 21x to 25x faster, with 15x to 21x better energy efficiency, than ARM Cortex M7-based systems.

Contribution

It presents a new set of 27 kernels for mixed-precision QNN inference on PULP clusters, enabling efficient deployment on extreme-edge devices.

Findings

01 Achieves 16 MACs/cycle peak performance on 8 cores

02 Performs 21x to 25x faster than ARM Cortex M7-based systems

03 Offers 15x to 21x better energy efficiency

Abstract

The deployment of Quantized Neural Networks (QNN) on advanced microcontrollers requires optimized software to exploit digital signal processing (DSP) extensions of modern instruction set architectures (ISA). As such, recent research proposed optimized libraries for QNNs (from 8-bit to 2-bit) such as CMSIS-NN and PULP-NN. This work presents an extension to the PULP-NN library targeting the acceleration of mixed-precision Deep Neural Networks, an emerging paradigm able to significantly shrink the memory footprint of deep neural networks with negligible accuracy loss. The library, composed of 27 kernels, one for each permutation of input feature maps, weights, and output feature maps precision (considering 8-bit, 4-bit and 2-bit), enables efficient inference of QNN on parallel ultra-low-power (PULP) clusters of RISC-V based processors, featuring the RV32IMCXpulpV2 ISA. The proposed…
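
The count of 27 kernels follows from choosing one of three bit widths independently for the input feature maps, the weights, and the output feature maps (3 x 3 x 3). The Python sketch below enumerates those combinations and estimates how weight storage shrinks at lower precision. It is an illustration only, not code from the library: the function names `kernel_variants` and `packed_bytes`, the layer shape, and the assumption that sub-byte values are packed densely with no padding are all hypothetical.

```python
from itertools import product

# Bit widths named in the abstract.
PRECISIONS = (8, 4, 2)


def kernel_variants(precisions=PRECISIONS):
    """Return every (input, weight, output) bit-width combination."""
    return list(product(precisions, repeat=3))


def packed_bytes(num_values, bits):
    """Bytes needed to store num_values elements at `bits` bits each,
    assuming dense packing and rounding up to a whole byte."""
    return (num_values * bits + 7) // 8


variants = kernel_variants()
print(len(variants))  # 27 combinations, one per kernel

# Hypothetical 3x3 convolution with 32 input and 64 output channels.
num_weights = 3 * 3 * 32 * 64
for bits in PRECISIONS:
    print(f"{bits}-bit weights: {packed_bytes(num_weights, bits)} bytes")
```

Under these assumptions, moving the example layer's weights from 8-bit to 2-bit cuts their storage by a factor of four, which is the kind of footprint reduction the abstract attributes to mixed-precision networks.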
