Energy Efficient Hardware Acceleration of Neural Networks with Power-of-Two Quantisation
Dominika Przewlocka-Rus, Tomasz Kryjak

TL;DR
This paper presents a hardware neural network accelerator using Power-of-Two quantisation, achieving at least 1.4x energy efficiency improvement over uniform quantisation, with additional power savings via a new pruning method.
Contribution
It introduces a hardware implementation of PoT quantisation on FPGA and a novel pruning technique tailored for logarithmic quantisation, enhancing energy efficiency.
Findings
At least 1.4x energy efficiency gain with PoT quantisation.
Effective pruning method reduces power by skipping zero weights.
Hardware implementation on Zynq UltraScale+ FPGA demonstrates practical benefits.
Abstract
Deep neural networks virtually dominate most modern vision systems, providing high performance at the cost of increased computational complexity. Since these systems are often required to operate both in real time and with minimal energy consumption (e.g., in wearable devices, autonomous vehicles, edge Internet of Things (IoT) devices, or sensor networks), various network optimisation techniques are used, e.g., quantisation, pruning, or dedicated lightweight architectures. Because the weights in neural network layers follow a logarithmic distribution, Power-of-Two (PoT) quantisation, whose levels are also logarithmically distributed, provides high performance with a significant reduction in computational precision (for weights of 4 bits or fewer). This method also makes it possible to replace the Multiply and ACcumulate (MAC --…
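As a rough illustration of why PoT weights are attractive in hardware, the sketch below quantises weights to signed powers of two and then accumulates products using bit shifts only, skipping zero weights. It is a minimal software model under stated assumptions, not the authors' FPGA design or their pruning method: the function names, the 4-bit encoding (1 sign bit plus 7 non-zero magnitudes from 2^0 down to 2^-6), the log-domain rounding, and the fixed-point accumulator format are all hypothetical choices made for this example.

```python
import math

FRAC = 6  # assumed number of fractional bits in the accumulator (matches 2^-6)


def pot_quantise(w, bits=4):
    """Map a real weight to (sign, exponent) so that w is approximated by
    sign * 2**exponent. Assumed encoding: exponents 0 down to -(levels - 1);
    smaller magnitudes are flushed to zero, larger ones are clipped to 2**0."""
    levels = 2 ** (bits - 1) - 1   # non-zero magnitudes, 7 for 4 bits
    min_exp = -(levels - 1)        # smallest exponent, -6 for 4 bits
    if w == 0.0:
        return 0, 0
    sign = 1 if w > 0 else -1
    exp = round(math.log2(abs(w)))  # round in the log domain (an assumption)
    if exp < min_exp:
        return 0, 0                 # too small to represent: becomes zero
    return sign, min(exp, 0)


def shift_mac(acts, qweights):
    """Accumulate integer activations times PoT weights using shifts only.
    The result is a fixed-point integer scaled by 2**FRAC."""
    acc = 0
    for a, (sign, exp) in zip(acts, qweights):
        if sign == 0:
            continue                # zero weight: no shift, no add
        acc += sign * (a << (FRAC + exp))  # a * 2**exp in fixed point
    return acc


weights = [0.5, -0.25, 0.0, 1.0]
acts = [4, 8, 100, 3]
qweights = [pot_quantise(w) for w in weights]
result = shift_mac(acts, qweights) / 2 ** FRAC
```

Here `result` equals the exact dot product of `acts` and `weights` because every weight in the example is already a power of two or zero; for general weights, the quantisation step introduces rounding error.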
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · CCD and CMOS Imaging Sensors
Methods: Pruning · Convolution
