Going Further With Winograd Convolutions: Tap-Wise Quantization for Efficient Inference on 4x4 Tiles
Renzo Andri, Beatrice Bussolino, Antonio Cipolletta, Lukas Cavigelli, Zhe Wang

TL;DR
This paper introduces a tap-wise quantization method and custom hardware to enable efficient, integer-only inference with larger Winograd tiles, significantly improving throughput and energy efficiency in neural network convolution operations.
Contribution
The paper presents a novel tap-wise quantization technique and specialized hardware units that unlock the potential of Winograd F4 convolutions for efficient neural network inference.
Findings
Achieves near FP32 accuracy with quantized Winograd F4.
Up to 1.85x energy efficiency gain on hardware.
Up to 1.83x end-to-end speed-up on vision networks.
Abstract
Most of today's computer vision pipelines are built around deep neural networks, where convolution operations require most of the generally high compute effort. The Winograd convolution algorithm computes convolutions with fewer MACs compared to the standard algorithm, reducing the operation count by a factor of 2.25x for 3x3 convolutions when using the version with 2x2-sized tiles (F2). Even though the gain is significant, the Winograd algorithm with larger tile sizes, i.e., F4, offers even more potential in improving throughput and energy efficiency, as it reduces the required MACs by 4x. Unfortunately, the Winograd algorithm with larger tile sizes introduces numerical issues that prevent its use on integer domain-specific accelerators, as well as higher computational overhead to transform input and output data between the spatial and Winograd domains. To unlock the full potential of…
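The 2.25x and 4x figures quoted in the abstract can be checked by counting multiplications per output tile. The sketch below assumes the standard 2D Winograd accounting: an m x m output tile with an r x r kernel needs m^2 * r^2 MACs when computed directly, and (m + r - 1)^2 element-wise multiplications in the Winograd domain. The function name is hypothetical, and the count deliberately ignores the input, weight, and output transforms, which the abstract notes become more costly for larger tiles.

```python
def winograd_mac_reduction(m: int, r: int = 3) -> float:
    """Ratio of direct MACs to Winograd-domain multiplications per output tile.

    m: output tile edge (2 for F2, 4 for F4); r: kernel edge (3 for 3x3).
    Illustrative helper only; transform overhead is not counted.
    """
    direct_macs = (m * m) * (r * r)      # one MAC per kernel tap per output pixel
    winograd_mults = (m + r - 1) ** 2    # one multiply per Winograd-domain tap
    return direct_macs / winograd_mults


for m in (2, 4):
    print(f"F{m}: {winograd_mac_reduction(m):.2f}x fewer multiplications")
# F2: 2.25x fewer multiplications
# F4: 4.00x fewer multiplications
```

For F2 this is 36 direct MACs against 16 Winograd-domain multiplications, and for F4 it is 144 against 36, matching the factors stated in the abstract.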
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · CCD and CMOS Imaging Sensors
Methods: Convolution
