Enabling Sparse Winograd Convolution by Native Pruning
Sheng Li, Jongsoo Park, Ping Tak Peter Tang

TL;DR
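This paper introduces a Winograd layer that lets sparse Winograd convolution kernels be trained and pruned directly. On AlexNet with ImageNet it reaches over 90% sparsity with only 0.1% accuracy loss, and the accompanying sparse convolution implementation reaches up to 31.7 effective TFLOP/s on an Intel Xeon CPU.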
Contribution
It proposes a novel Winograd layer for native pruning of Winograd coefficients, enabling high sparsity and efficient sparse convolution implementation.
Findings
Achieved over 90% sparsity with only 0.1% accuracy loss for AlexNet on ImageNet.
Reached up to 31.7 effective TFLOP/s in 32-bit precision on an Intel Xeon CPU.
Provided a sparse Winograd convolution algorithm that accelerates CNN inference.
Abstract
Sparse methods and the use of Winograd convolutions are two orthogonal approaches, each of which significantly accelerates convolution computations in modern CNNs. Sparse Winograd merges these two and thus has the potential to offer a combined performance benefit. Nevertheless, training convolution layers so that the resulting Winograd kernels are sparse has not hitherto been very successful. By introducing a Winograd layer in place of a standard convolution layer, we can learn and prune Winograd coefficients "natively" and obtain sparsity level beyond 90% with only 0.1% accuracy loss with AlexNet on ImageNet dataset. Furthermore, we present a sparse Winograd convolution algorithm and implementation that exploits the sparsity, achieving up to 31.7 effective TFLOP/s in 32-bit precision on a latest Intel Xeon CPU, which corresponds to a 5.4x speedup over a state-of-the-art dense…
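The idea in the abstract can be illustrated with a small 1D toy. The sketch below uses the standard F(2,3) Winograd transform matrices to show that a filter that is sparse in the spatial domain is generally less sparse after the transform, and that storing the weights in the Winograd domain allows them to be pruned there directly. The function names (direct_conv, winograd_layer, winograd_f23, magnitude_prune) are hypothetical, and the magnitude-based pruning rule is an assumption made for illustration; the paper works with 2D convolutions and a full training procedure, neither of which is reproduced here.

```python
import numpy as np

# Standard 1D F(2,3) Winograd matrices: 2 outputs from a 3-tap filter and 4 inputs.
BT = np.array([[1, 0, -1, 0],
               [0, 1, 1, 0],
               [0, -1, 1, 0],
               [0, 1, 0, -1]], dtype=np.float64)   # input transform
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])                    # filter transform
AT = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=np.float64)  # output transform


def direct_conv(d, g):
    """Reference result: two outputs of a 3-tap sliding dot product over 4 inputs."""
    return np.array([d[0:3] @ g, d[1:4] @ g])


def winograd_layer(d, w):
    """Compute the outputs from Winograd-domain weights w (4 values).

    Treating w as the stored parameter, rather than deriving it from a
    spatial filter, is what allows zeros in w to be exploited directly.
    """
    return AT @ (w * (BT @ d))


def winograd_f23(d, g):
    """Conventional use: transform the spatial filter g, then apply the layer."""
    return winograd_layer(d, G @ g)


def magnitude_prune(w, sparsity):
    """Zero the smallest-magnitude fraction of w (illustrative rule, an assumption)."""
    k = int(round(sparsity * w.size))
    out = w.copy().ravel()
    if k > 0:
        out[np.argsort(np.abs(out))[:k]] = 0.0
    return out.reshape(w.shape)


d = np.array([1.0, 2.0, 3.0, 4.0])
g_sparse = np.array([1.0, 0.0, 0.0])        # 2 of 3 spatial taps are zero
w_from_g = G @ g_sparse                     # only 1 of 4 Winograd weights is zero
w_native = magnitude_prune(np.array([0.1, -2.0, 0.05, 1.0]), 0.5)
y = winograd_layer(d, w_native)             # uses the pruned Winograd weights directly
```

In this toy, g_sparse has two zeros out of three taps, but its transform G @ g_sparse has only one zero out of four entries, which is the loss of sparsity that motivates pruning in the Winograd domain instead.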
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Vision and Imaging · Image Enhancement Techniques
Methods: 1x1 Convolution · Local Response Normalization · Grouped Convolution · Dropout · Dense Connections · Max Pooling · Softmax · Convolution
