HiKonv: High Throughput Quantized Convolution With Novel Bit-wise Management and Computation
Xinheng Liu, Yao Chen, Prakhar Ganesh, Junhao Pan, Jinjun Xiong, Deming Chen

TL;DR
HiKonv introduces a novel bit-wise parallel computation method that significantly enhances the throughput of quantized CNN convolution operations on existing hardware like CPUs and DSPs, enabling faster and more efficient low-bitwidth neural network processing.
Contribution
The paper presents a unified approach to maximize compute throughput for quantized CNNs on existing processors using novel bit-wise parallel computation techniques.
Findings
A single 32-bit processing unit can perform 128 binarized convolution operations per instruction.
A single 27x18 DSP core can execute 8 four-bit convolution operations per cycle.
HiKonv achieves a 3.17x latency reduction on CPU and a 2.37x throughput increase on FPGA.
Abstract
Quantization for Convolutional Neural Network (CNN) has shown significant progress with the intention of reducing the cost of computation and storage with low-bitwidth data inputs. There are, however, no systematic studies on how an existing full-bitwidth processing unit, such as CPUs and DSPs, can be better utilized to carry out significantly higher computation throughput for convolution under various quantized bitwidths. In this study, we propose HiKonv, a unified solution that maximizes the compute throughput of a given underlying processing unit to process low-bitwidth quantized data inputs through novel bit-wise parallel computation. We establish theoretical performance bounds using a full-bitwidth multiplier for highly parallelized low-bitwidth convolution, and demonstrate new breakthroughs for high-performance computing in this critical domain. For example, a single 32-bit…
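The general idea of using one wide multiplier for many low-bitwidth operations can be illustrated with a minimal sketch. It is not taken from the paper: it assumes unsigned inputs, uses Python's arbitrary-precision integers in place of a fixed-width multiplier, and the helper names `pack` and `conv1d_via_one_multiply` are hypothetical. Each input value is placed in its own slot of a wide integer, with enough guard bits per slot that partial sums cannot spill into the neighbouring slot. A single multiplication of the two packed words then yields every output of a full 1D convolution, one per slot.

```python
def pack(values, slice_bits):
    """Pack unsigned integers into one integer, one value per slice_bits-wide slot."""
    word = 0
    for i, v in enumerate(values):
        word |= v << (i * slice_bits)
    return word


def conv1d_via_one_multiply(f, g, bits_f, bits_g):
    """Full 1D convolution of unsigned sequences f and g using one multiplication.

    bits_f and bits_g are the bitwidths of the elements of f and g.
    This is an illustrative sketch under the assumptions stated above.
    """
    # Each output sums at most min(len(f), len(g)) products, so reserve
    # ceil(log2(terms)) guard bits on top of the bits_f + bits_g product bits.
    terms = min(len(f), len(g))
    slice_bits = bits_f + bits_g + (terms - 1).bit_length()

    # One wide multiplication produces all partial sums at once.
    product = pack(f, slice_bits) * pack(g, slice_bits)

    # Slot k of the product holds sum of f[i] * g[j] over all i + j == k.
    mask = (1 << slice_bits) - 1
    n_out = len(f) + len(g) - 1
    return [(product >> (k * slice_bits)) & mask for k in range(n_out)]


def conv1d_reference(f, g):
    """Naive full 1D convolution, used only to check the packed version."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


result = conv1d_via_one_multiply([3, 1, 2], [1, 2], bits_f=4, bits_g=4)
print(result)  # [3, 7, 4, 4]
```

On real hardware the operand widths of the multiplier (for example 32 bits, or 27 and 18 bits) limit how many values fit in each packed word, and signed inputs need additional handling that this sketch omits.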
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Advanced Image and Video Retrieval Techniques
Methods: Convolution
