Accelerating Large Kernel Convolutions with Nested Winograd Transformation
Jingbo Jiang, Xizi Chen, Chi-Ying Tsui

TL;DR
This paper introduces a nested Winograd algorithm that efficiently accelerates large kernel convolutions in CNNs by iteratively decomposing them into smaller kernels, significantly reducing multiplications and improving computational efficiency.
Contribution
It proposes a novel nested Winograd transformation that outperforms linear decomposition methods for large kernel convolutions in CNNs.
Findings
Reduces the number of multiplications by 1.4 to 10.5 times for 4x4 to 31x31 convolutions, compared with the linear decomposition Winograd approach.
Proves the nested Winograd algorithm is more effective than linear decomposition.
Enhances the efficiency of large kernel CNNs in computer vision tasks.
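To make the idea of a "reduction in multiplications" concrete, the sketch below counts multiplications per output tile for a direct convolution and for a single-level Winograd minimal-filtering algorithm F(m, r), which needs m + r - 1 multiplications per dimension. These are textbook counts and do not reproduce the paper's 1.4 to 10.5 times figures, which compare nested against linear decomposition; the function names are hypothetical.

```python
def direct_mults(m: int, r: int, dims: int = 2) -> int:
    """Multiplications to produce an m-wide output tile with an r-tap kernel, per tile."""
    return (m * r) ** dims


def winograd_mults(m: int, r: int, dims: int = 2) -> int:
    """Multiplications for Winograd F(m, r), nested across `dims` dimensions."""
    return (m + r - 1) ** dims


def reduction(m: int, r: int, dims: int = 2) -> float:
    """Ratio of direct to Winograd multiplications for one output tile."""
    return direct_mults(m, r, dims) / winograd_mults(m, r, dims)


if __name__ == "__main__":
    for m, r in [(2, 3), (4, 3)]:
        print(f"F({m}x{m}, {r}x{r}): "
              f"{direct_mults(m, r)} direct vs {winograd_mults(m, r)} Winograd, "
              f"ratio {reduction(m, r):.2f}")
```

The ratio grows with the tile size m, but in practice larger tiles also need larger transform matrices, which is one reason large kernels are usually decomposed rather than handled by a single large transform.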
Abstract
Recent literature has shown that convolutional neural networks (CNNs) with large kernels outperform vision transformers (ViTs) and CNNs with stacked small kernels in many computer vision tasks, such as object detection and image restoration. The Winograd transformation helps reduce the number of repetitive multiplications in convolution and is widely supported by many commercial AI processors. Researchers have proposed accelerating large kernel convolutions by linearly decomposing them into many small kernel convolutions and then sequentially accelerating each small kernel convolution with the Winograd algorithm. This work proposes a nested Winograd algorithm that iteratively decomposes a large kernel convolution into small kernel convolutions and proves it to be more effective than the linear decomposition Winograd transformation algorithm. Experiments show that compared to the linear…
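As a rough illustration of the linear decomposition baseline the abstract describes, the following sketch splits a 6-tap 1D kernel into two 3-tap kernels and accelerates each with the standard Winograd F(2, 3) transform. It is not the paper's nested algorithm. The transform matrices are the commonly used F(2, 3) ones, and the function names and the 1D setting are assumptions made for brevity.

```python
import numpy as np

# Commonly used F(2, 3) transform matrices (assumed here, not taken from the paper).
BT = np.array([[1, 0, -1, 0],
               [0, 1, 1, 0],
               [0, -1, 1, 0],
               [0, 1, 0, -1]], dtype=float)
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
AT = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=float)


def winograd_f23(d: np.ndarray, g: np.ndarray) -> np.ndarray:
    """Two outputs of a 3-tap correlation over a 4-sample tile.

    The element-wise product uses 4 multiplications instead of the 6
    a direct computation needs.
    """
    return AT @ ((G @ g) * (BT @ d))


def linear_decomp_6tap(x: np.ndarray, k: np.ndarray) -> np.ndarray:
    """Two outputs of a 6-tap correlation over a 7-sample tile.

    The kernel is split into two 3-tap halves; each half is applied to
    the matching shifted input window and the partial results are summed.
    """
    return winograd_f23(x[0:4], k[0:3]) + winograd_f23(x[3:7], k[3:6])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(7)
    k = rng.standard_normal(6)
    reference = np.correlate(x, k, mode="valid")  # direct 6-tap result
    print(np.allclose(linear_decomp_6tap(x, k), reference))
```

Here the decomposed version uses 8 element-wise multiplications for two outputs, against 12 for the direct computation. The number of small convolutions grows with the kernel size, which is the cost the nested approach is meant to reduce.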
Taxonomy
Methods: Convolution
