TL;DR
PoTAcc is an open-source pipeline that accelerates and evaluates power-of-two quantized DNNs on resource-constrained edge devices, combining model deployment with custom FPGA accelerators for improved performance and energy efficiency.
Contribution
We introduce PoTAcc, a comprehensive pipeline enabling end-to-end deployment and acceleration of PoT-quantized DNNs on heterogeneous edge hardware, including FPGA-based accelerators.
Findings
Achieves up to 3.6x speedup over CPU-only execution.
Reduces energy consumption by up to 78%.
Supports multiple PoT quantization strategies on FPGA platforms.
Abstract
Power-of-two (PoT) quantization significantly reduces the size of deep neural networks (DNNs) and replaces multiplications with bit-shift operations for inference. Prior work has shown that PoT-quantized DNNs can preserve accuracy for tasks such as image classification; however, their performance on resource-constrained edge devices remains insufficiently understood. While general-purpose edge CPUs and GPUs do not provide optimized backends for bit-shift operations, custom hardware accelerators can better exploit PoT quantization by implementing dedicated shift-based processing elements. However, deploying PoT-quantized models on such accelerators is challenging due to limited support in existing inference frameworks. In addition, the impact of different PoT quantization strategies on hardware design, performance, and energy efficiency during full inference has not been systematically…
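To make the shift-based arithmetic mentioned in the abstract concrete, here is a minimal sketch of how a power-of-two weight lets a multiplication be replaced by a bit shift. It is not taken from PoTAcc: the function names (`pot_quantize`, `shift_multiply`, `pot_dot`), the exponent range, and the log-domain rounding rule are illustrative assumptions, and the PoT quantization strategies evaluated in the paper may encode weights differently.

```python
import math

def pot_quantize(w, min_exp=-7, max_exp=0):
    """Map a real weight to (sign, exp) so that w is approximated by sign * 2**exp.

    Assumption: round the exponent in the log2 domain and clamp it to
    [min_exp, max_exp]; this range is illustrative, not PoTAcc's.
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    exp = round(math.log2(abs(w)))
    exp = max(min_exp, min(max_exp, exp))
    return sign, exp

def shift_multiply(x, sign, exp):
    """Multiply integer activation x by sign * 2**exp using only a shift and a negation."""
    if sign == 0:
        return 0
    # Non-negative exponents shift left; negative exponents shift right,
    # which discards the low-order bits of x.
    shifted = x << exp if exp >= 0 else x >> -exp
    return sign * shifted

def pot_dot(activations, weights):
    """Dot product of integer activations with PoT-quantized weights, with no multiplies by weights."""
    total = 0
    for x, w in zip(activations, weights):
        sign, exp = pot_quantize(w)
        total += shift_multiply(x, sign, exp)
    return total
```

For example, a weight of 0.25 is stored as sign 1 and exponent -2, so multiplying the activation 64 by it becomes `64 >> 2`. A shift-based processing element in hardware performs the same operation without a multiplier, which is the property custom accelerators can exploit.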
