A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance
Ian Colbert, Alessandro Pappalardo, Jakoba Petri-Koenig

TL;DR
A2Q introduces a weight quantization method that ensures low-precision accumulators in neural networks do not overflow, maintaining accuracy while significantly reducing resource usage, especially on FPGA hardware.
Contribution
The paper proposes a novel accumulator-aware quantization technique that constrains weight norms to prevent overflow, enabling efficient low-precision neural network inference.
Findings
A2Q achieves up to 2.3x resource reduction on FPGA accelerators.
Maintains 99.2% of floating-point accuracy with low-precision accumulators.
Effective across multiple computer vision benchmarks.
Abstract
We present accumulator-aware quantization (A2Q), a novel weight quantization method designed to train quantized neural networks (QNNs) to avoid overflow when using low-precision accumulators during inference. A2Q introduces a unique formulation inspired by weight normalization that constrains the L1-norm of model weights according to accumulator bit width bounds that we derive. Thus, in training QNNs for low-precision accumulation, A2Q also inherently promotes unstructured weight sparsity to guarantee overflow avoidance. We apply our method to deep learning-based computer vision tasks to show that A2Q can train QNNs for low-precision accumulators while maintaining model accuracy competitive with a floating-point baseline. In our evaluations, we consider the impact of A2Q on both general-purpose platforms and programmable hardware. However, we primarily target model deployment on FPGAs…
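To make the link between an L1-norm constraint and overflow avoidance concrete, here is a minimal worst-case sketch in Python. It is not the bound derived in the paper, which this summary does not reproduce. It assumes integer weights, N-bit integer inputs, and a signed P-bit accumulator. The function names `max_input_magnitude`, `l1_budget`, and `fits_accumulator` are hypothetical. The reasoning is that every partial sum of a dot product is bounded in magnitude by the L1 norm of the weights times the largest input magnitude, so keeping that product within the accumulator range rules out overflow for any input.

```python
def max_input_magnitude(input_bits: int, signed: bool) -> int:
    """Largest absolute value an N-bit integer input can take."""
    # Signed two's complement reaches -2^(N-1); unsigned reaches 2^N - 1.
    return 2 ** (input_bits - 1) if signed else 2 ** input_bits - 1


def l1_budget(acc_bits: int, input_bits: int, signed_input: bool = False) -> int:
    """Largest integer L1 norm whose worst-case dot product fits the accumulator.

    Assumption (not taken from the paper): a signed P-bit accumulator with
    largest positive value 2^(P-1) - 1, and a conservative worst case in which
    every input takes its maximum magnitude.
    """
    acc_max = 2 ** (acc_bits - 1) - 1
    return acc_max // max_input_magnitude(input_bits, signed_input)


def fits_accumulator(weights, acc_bits: int, input_bits: int,
                     signed_input: bool = False) -> bool:
    """Check whether integer weights satisfy the worst-case L1 budget."""
    l1_norm = sum(abs(w) for w in weights)
    return l1_norm <= l1_budget(acc_bits, input_bits, signed_input)


if __name__ == "__main__":
    # 8-bit unsigned inputs accumulated in a signed 16-bit register.
    print(l1_budget(acc_bits=16, input_bits=8))
    print(fits_accumulator([100, -28], acc_bits=16, input_bits=8))
```

Under these assumptions, a smaller accumulator leaves a smaller L1 budget, which is consistent with the abstract's remark that training for low-precision accumulation promotes weight sparsity.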
Taxonomy
Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Image Enhancement Techniques
Methods: Weight Normalization
