XpulpNN: Enabling Energy Efficient and Flexible Inference of Quantized Neural Networks on RISC-V based IoT End Nodes
Angelo Garofalo, Giuseppe Tagliavini, Francesco Conti, Luca Benini, and Davide Rossi

TL;DR
This paper presents RISC-V ISA extensions and a parallel cluster design that improve the energy efficiency of quantized neural network inference on IoT microcontrollers, achieving near-linear speedups over higher-precision integer computation.
Contribution
It introduces nibble (4-bit) and crumb (2-bit) SIMD instructions for RISC-V, together with an execution paradigm that fuses a dot product with a load operation, enabling efficient QNN inference on microcontroller-class cores integrated in a parallel 8-core cluster.
Findings
6x to 8x faster QNN kernels with 2-bit to 4-bit data operands
Peak efficiency of 2.22 TOPs/s/W, comparable to dedicated accelerators
Up to 1000x better energy efficiency than ARM Cortex-M microcontrollers
Abstract
This work introduces lightweight extensions to the RISC-V ISA to boost the efficiency of heavily Quantized Neural Network (QNN) inference on microcontroller-class cores. By extending the ISA with nibble (4-bit) and crumb (2-bit) SIMD instructions, we are able to show near-linear speedup with respect to higher precision integer computation on the key kernels for QNN computation. Also, we propose a custom execution paradigm for SIMD sum-of-dot-product operations, which consists of fusing a dot product with a load operation, with an up to 1.64x peak MAC/cycle improvement compared to a standard execution scenario. To further push the efficiency, we integrate the RISC-V extended core in a parallel cluster of 8 processors, with near-linear improvement with respect to a single core architecture. To evaluate the proposed extensions, we fully implement the cluster of processors in GF22FDX…
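The nibble and crumb sum-of-dot-product operation described in the abstract can be sketched in software. The following is a minimal emulation, not the authors' implementation: it assumes 32-bit registers, signed lanes, and lane 0 in the least significant bits, and the names `pack_signed`, `unpack_signed`, and `sdotp` are hypothetical helpers rather than XpulpNN mnemonics. Under those assumptions, one 32-bit word holds eight 4-bit or sixteen 2-bit operands, which is where the near-linear speedup over 8-bit SIMD (four lanes) would come from.

```python
# Software emulation of a packed SIMD sum-of-dot-product.
# Assumptions (not taken from the source): 32-bit registers, signed lanes,
# lane 0 stored in the least significant bits. All names are hypothetical.

WORD_BITS = 32


def pack_signed(values, bits):
    """Pack signed integers of width `bits` into one 32-bit word."""
    lanes = WORD_BITS // bits
    if len(values) != lanes:
        raise ValueError(f"expected {lanes} values for {bits}-bit lanes")
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    mask = (1 << bits) - 1
    word = 0
    for i, v in enumerate(values):
        if not lo <= v <= hi:
            raise ValueError(f"{v} does not fit in a signed {bits}-bit lane")
        word |= (v & mask) << (i * bits)  # two's complement encoding
    return word


def unpack_signed(word, bits):
    """Split a 32-bit word into signed lanes of width `bits`."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    lanes = []
    for i in range(WORD_BITS // bits):
        raw = (word >> (i * bits)) & mask
        lanes.append(raw - (1 << bits) if raw & sign else raw)
    return lanes


def sdotp(acc, a_word, b_word, bits):
    """Return acc + sum(a[i] * b[i]) over all lanes, wrapped to signed 32 bits."""
    a = unpack_signed(a_word, bits)
    b = unpack_signed(b_word, bits)
    total = (acc + sum(x * y for x, y in zip(a, b))) & 0xFFFFFFFF
    return total - (1 << 32) if total & 0x80000000 else total


# Nibble (4-bit) case: eight multiply-accumulates per emulated instruction.
a4 = pack_signed([1, -2, 3, -4, 5, -6, 7, -8], bits=4)
b4 = pack_signed([7, 6, 5, 4, 3, 2, 1, 0], bits=4)
nibble_result = sdotp(10, a4, b4, bits=4)

# Crumb (2-bit) case: sixteen multiply-accumulates per emulated instruction.
a2 = pack_signed([1, -2] * 8, bits=2)
b2 = pack_signed([-1, 1] * 8, bits=2)
crumb_result = sdotp(0, a2, b2, bits=2)
```

In this sketch a single `sdotp` call stands in for one instruction, so halving the operand width doubles the multiply-accumulates per call. The fused dot-product-and-load paradigm mentioned in the abstract is not modeled here.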
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Advanced Memory and Neural Computing
