HiKonv: Maximizing the Throughput of Quantized Convolution With Novel Bit-wise Management and Computation
Yao Chen, Junhao Pan, Xinheng Liu, Jinjun Xiong, Deming Chen

TL;DR
HiKonv introduces a novel approach to maximize convolution throughput in quantized CNNs by leveraging bit-wise management and parallel computation, significantly improving performance on CPU and FPGA hardware.
Contribution
It presents a unified framework and performance models for high-throughput low-bitwidth convolution using existing full-bitwidth units, enabling substantial hardware efficiency gains.
Findings
A single 32-bit CPU unit can perform multiple low-bitwidth convolutions simultaneously.
HiKonv achieves up to 7.6x performance improvement on CPU for 1-D convolution.
On FPGA, HiKonv enhances throughput and DSP efficiency, outperforming previous models.
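The first finding can be illustrated with a minimal sketch of the general packing idea: several low-bitwidth values are placed in separate bit slots of one wide integer, so a single multiplication yields every term of a 1-D convolution at once. This is an illustration under stated assumptions, not the paper's exact formulation: inputs are taken to be unsigned, the slot width is chosen conservatively from the worst-case output value, and the names `slice_width`, `pack`, `packed_conv1d`, and `reference_conv1d` are hypothetical. Python integers are unbounded, so the sketch does not enforce a 32-bit limit; on real hardware the packed operands and their product must fit the multiplier.

```python
def slice_width(p, q, n, k):
    """Bits per slot so that no output term can spill into the next slot.

    Assumes unsigned p-bit inputs and unsigned q-bit kernel values.
    """
    max_term = min(n, k) * ((1 << p) - 1) * ((1 << q) - 1)
    return max(max_term.bit_length(), 1)


def pack(values, s):
    """Place values[i] at bit offset s * i of a single integer."""
    word = 0
    for i, v in enumerate(values):
        word |= v << (s * i)
    return word


def packed_conv1d(f, g, p, q):
    """Full 1-D convolution of f and g using one wide multiplication."""
    n, k = len(f), len(g)
    s = slice_width(p, q, n, k)
    # The product of the packed words is a polynomial product in 2**s,
    # so slot m of the result holds sum(f[i] * g[j] for i + j == m).
    product = pack(f, s) * pack(g, s)
    mask = (1 << s) - 1
    return [(product >> (s * m)) & mask for m in range(n + k - 1)]


def reference_conv1d(f, g):
    """Plain nested-loop convolution, used only to check the packed result."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


if __name__ == "__main__":
    f, g = [3, 1, 2], [1, 2]          # 2-bit unsigned values
    print(packed_conv1d(f, g, 2, 2))  # [3, 7, 4, 4]
    print(reference_conv1d(f, g))     # [3, 7, 4, 4]
```

In this example the slot width is 5 bits, the packed operands are 2083 and 65, and their product 135395 fits comfortably in 32 bits, so one multiplication replaces the six small multiplications of the nested loop. Signed inputs need extra handling that this sketch omits.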
Abstract
Quantization for CNN has shown significant progress with the intention of reducing the cost of computation and storage with low-bitwidth data representations. There are, however, no systematic studies on how an existing full-bitwidth processing unit, such as the ALU in CPUs and the DSP in FPGAs, can be better utilized to deliver significantly higher computation throughput for convolution under various quantized bitwidths. In this study, we propose HiKonv, a unified solution that maximizes the throughput of convolution on a given underlying processing unit with low-bitwidth quantized data inputs through novel bit-wise management and parallel computation. We establish a theoretical framework and performance models using a full-bitwidth multiplier for highly parallelized low-bitwidth convolution, and demonstrate new breakthroughs for high-performance computing in this critical domain. For example, a…
Taxonomy
Topics: Advanced Neural Network Applications · Cell Image Analysis Techniques · Advanced Image and Video Retrieval Techniques
Methods: Convolution
