Energy Efficient Hardware Acceleration of Neural Networks with Power-of-Two Quantisation

Dominika Przewlocka-Rus; Tomasz Kryjak

arXiv:2209.15257·cs.CV·November 14, 2023

TL;DR

This paper presents a hardware neural network accelerator based on Power-of-Two (PoT) quantisation that is at least 1.4x more energy efficient than one using uniform quantisation, with additional power savings from a new pruning method.

Contribution

It introduces a hardware implementation of PoT quantisation on FPGA and a novel pruning technique tailored for logarithmic quantisation, enhancing energy efficiency.

Findings

01

At least 1.4x energy efficiency gain with PoT quantisation.

02

Effective pruning method reduces power by skipping zero weights.

03

Hardware implementation on Zynq UltraScale+ FPGA demonstrates practical benefits.

Abstract

Deep neural networks virtually dominate the domain of most modern vision systems, providing high performance at the cost of increased computational complexity. Since these systems are often required to operate both in real time and with minimal energy consumption (e.g., in wearable devices, autonomous vehicles, edge Internet of Things (IoT) devices, or sensor networks), various network optimisation techniques are used, e.g., quantisation, pruning, or dedicated lightweight architectures. Because the weights in neural network layers follow a logarithmic distribution, Power-of-Two (PoT) quantisation, whose levels are also logarithmically distributed, provides high performance with a significant reduction in computational precision (for weights of 4 bits or fewer). This method introduces additional possibilities of replacing the typical neural network Multiply and ACcumulate (MAC --…
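To make the idea concrete, here is a minimal Python sketch of PoT quantisation and of a dot product that uses bit shifts instead of multiplications and skips zero weights. It is an illustration under stated assumptions, not the paper's hardware design or its pruning method: the function names (`min_exponent`, `pot_quantise`, `shift_dot`), the 4-bit encoding (one sign bit, one zero code, exponents 0 down to -6), the weight range of [-1, 1], and the integer activations are all hypothetical choices made for this example.

```python
import math

def min_exponent(bits=4):
    """Smallest exponent in the assumed encoding: one sign bit, one zero
    code, and the remaining codes for exponents 0, -1, ..., min_exp."""
    return -(2 ** (bits - 1) - 2)          # bits=4 -> -6

def pot_quantise(w, bits=4):
    """Map a real weight to (sign, exponent), meaning sign * 2**exponent,
    or to None when the weight is quantised to zero."""
    min_exp = min_exponent(bits)
    if abs(w) < 2.0 ** (min_exp - 1):      # below the smallest level: zero
        return None
    exp = round(math.log2(abs(w)))         # nearest power of two (log domain)
    exp = max(min_exp, min(0, exp))        # clip to the representable range
    return (1 if w > 0 else -1, exp)

def shift_dot(xs, qws, bits=4):
    """Dot product of integer activations xs with PoT weights qws.

    Each product x * 2**exp is computed as a left shift by (exp - min_exp),
    so the accumulator is an integer in units of 2**min_exp.
    Zero weights are skipped; the number of skipped terms is returned too.
    """
    min_exp = min_exponent(bits)
    acc, skipped = 0, 0
    for x, q in zip(xs, qws):
        if q is None:                      # zero weight: no shift, no add
            skipped += 1
            continue
        sign, exp = q
        term = x << (exp - min_exp)        # shift replaces the multiplication
        acc += term if sign > 0 else -term
    return acc, skipped

if __name__ == "__main__":
    weights = [0.5, -0.25, 0.001, 1.0]
    activations = [4, 8, 100, 3]
    qws = [pot_quantise(w) for w in weights]
    acc, skipped = shift_dot(activations, qws)
    print(qws, acc * 2.0 ** min_exponent(), skipped)
```

In this sketch the accumulator is kept as an integer scaled by `2**min_exp`, so every shift is exact and a single rescaling at the end recovers the real-valued result. A hardware design would pick its own encoding and accumulator width.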

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · CCD and CMOS Imaging Sensors

Methods: Pruning · Convolution