Bitwise Systolic Array Architecture for Runtime-Reconfigurable Multi-precision Quantized Multiplication on Hardware Accelerators

Yuhao Liu; Salim Ullah; Akash Kumar

arXiv:2602.23334 · cs.AR · February 27, 2026

Open Access

TL;DR

This paper introduces a runtime-reconfigurable bitwise systolic array architecture for multi-precision quantized neural network acceleration, which speeds up mixed-precision inference and supports a higher clock frequency on an FPGA.

Contribution

It presents a novel hardware design that supports dynamic precision reconfiguration for QNNs, addressing limitations of previous fixed-precision accelerators.

Findings

01 Achieves a 1.3185× to 3.5671× speedup in mixed-precision inference

02 Supports a higher clock frequency of 250 MHz

03 Reduces critical path delay for improved performance

Abstract

Neural network accelerators have been widely applied to edge devices for complex tasks such as object tracking and image recognition. Previous works have explored quantization techniques in related lightweight accelerator designs to reduce hardware resource consumption. However, low precision leads to high accuracy loss in inference. Mixed-precision quantization has therefore become an alternative solution: it applies different precisions to different layers to trade off resource consumption against accuracy. Because regular hardware multiplier designs cannot support precision reconfiguration for a multi-precision Quantized Neural Network (QNN) model at runtime, we propose a runtime-reconfigurable multi-precision multi-channel bitwise systolic array design for QNN accelerators. We have implemented and evaluated our work on the Ultra96 FPGA platform. Results show that our…
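
To illustrate why a bitwise formulation lends itself to runtime precision changes, the following is a minimal Python sketch of multiplication decomposed into 1-bit AND partial products with shift-and-accumulate. It is not the architecture from the paper: the function names `bitwise_multiply` and `mixed_precision_dot`, the restriction to unsigned operands, and the sequential loop structure are all assumptions made for illustration. The point is only that the operand widths are ordinary runtime parameters, so the same logic can serve layers quantized to different precisions.

```python
from typing import Sequence


def bitwise_multiply(a: int, b: int, a_bits: int, b_bits: int) -> int:
    """Multiply two unsigned quantized values using only 1-bit partial products.

    Hypothetical helper for illustration; a_bits and b_bits are chosen at
    call time, mimicking a precision that is reconfigured at runtime.
    """
    if not 0 <= a < (1 << a_bits):
        raise ValueError("a does not fit in a_bits (unsigned)")
    if not 0 <= b < (1 << b_bits):
        raise ValueError("b does not fit in b_bits (unsigned)")

    acc = 0
    for i in range(a_bits):
        a_i = (a >> i) & 1  # i-th bit of a
        for j in range(b_bits):
            b_j = (b >> j) & 1  # j-th bit of b
            # A 1-bit product is an AND; its weight is 2^(i + j).
            acc += (a_i & b_j) << (i + j)
    return acc


def mixed_precision_dot(
    xs: Sequence[int], ws: Sequence[int], x_bits: int, w_bits: int
) -> int:
    """Dot product of unsigned activations and weights at the given widths."""
    if len(xs) != len(ws):
        raise ValueError("xs and ws must have the same length")
    return sum(bitwise_multiply(x, w, x_bits, w_bits) for x, w in zip(xs, ws))


if __name__ == "__main__":
    # Same routine, two different precisions selected at call time.
    print(bitwise_multiply(13, 11, a_bits=4, b_bits=4))
    print(mixed_precision_dot([200, 17], [3, 1], x_bits=8, w_bits=2))
```

In this sketch, lowering the precision simply shortens the loops, which loosely mirrors how a bit-level design could spend fewer partial-product steps on low-precision layers. Signed operands would need extra handling (for example, two's-complement sign weighting) that is omitted here.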

Taxonomy

Topics: Advanced Neural Network Applications · Numerical Methods and Algorithms · Network Packet Processing and Optimization