Enabling Sparse Winograd Convolution by Native Pruning

Sheng Li; Jongsoo Park; Ping Tak Peter Tang

arXiv:1702.08597 · cs.CV · October 17, 2017 · 47 cites

TL;DR

This paper introduces a method to train sparse Winograd convolution kernels directly, achieving high sparsity and significant speedups in CNNs, demonstrated on AlexNet with minimal accuracy loss.

Contribution

It proposes a novel Winograd layer for native pruning of Winograd coefficients, enabling high sparsity and efficient sparse convolution implementation.

Findings

01

Achieved sparsity beyond 90% with only 0.1% accuracy loss for AlexNet on ImageNet.

02

Reached up to 31.7 effective TFLOP/s in 32-bit precision on a recent Intel Xeon CPU.

03

Provided a sparse Winograd convolution algorithm that accelerates CNN inference.
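To make the idea of exploiting Winograd-domain sparsity concrete, here is a minimal NumPy sketch. It assumes the common F(2x2, 3x3) tile size and treats each of the 16 Winograd positions as a small (filters x channels) matrix product in which zero weights are skipped. The function name `sparse_winograd`, the array layouts, and the one-shot 90% magnitude threshold are illustrative assumptions, not the paper's optimized Xeon implementation.

```python
import numpy as np

# Standard F(2x2, 3x3) Winograd transform matrices.
B_T = np.array([[1, 0, -1, 0], [0, 1, 1, 0],
                [0, -1, 1, 0], [0, 1, 0, -1]], dtype=np.float64)
G = np.array([[1.0, 0.0, 0.0], [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5], [0.0, 0.0, 1.0]])
A_T = np.array([[1, 1, 1, 0], [0, 1, -1, -1]], dtype=np.float64)


def sparse_winograd(U, tiles):
    """Naive sparse Winograd convolution over a batch of tiles (hypothetical helper).

    U:     (4, 4, K, C) Winograd-domain weights, possibly mostly zero.
    tiles: (C, T, 4, 4) input tiles, C channels and T tiles.
    Returns (K, T, 2, 2) output tiles and the number of nonzero weights used.
    """
    K = U.shape[2]
    T = tiles.shape[1]
    # Input transform V = B^T d B for every channel and tile.
    V = np.einsum('ij,ctjk,lk->ilct', B_T, tiles, B_T)
    M = np.zeros((4, 4, K, T))
    nnz_used = 0
    for i in range(4):
        for l in range(4):
            # Only nonzero weights at this Winograd position contribute.
            ks, cs = np.nonzero(U[i, l])
            for k, c in zip(ks, cs):
                M[i, l, k] += U[i, l, k, c] * V[i, l, c]
            nnz_used += len(ks)
    # Output transform Y = A^T M A for every filter and tile.
    Y = np.einsum('pi,ilkt,ql->ktpq', A_T, M, A_T)
    return Y, nnz_used


rng = np.random.default_rng(0)
K, C, T = 8, 5, 3
g = rng.standard_normal((K, C, 3, 3))       # spatial 3x3 kernels
tiles = rng.standard_normal((C, T, 4, 4))   # 4x4 input tiles
U = np.einsum('ij,kcjm,lm->ilkc', G, g, G)  # U = G g G^T per (filter, channel)

# Sanity check: with unpruned weights the result matches direct convolution.
Y_dense, _ = sparse_winograd(U, tiles)
Y_direct = np.zeros((K, T, 2, 2))
for p in range(2):
    for q in range(2):
        Y_direct[:, :, p, q] = np.einsum(
            'ctab,kcab->kt', tiles[:, :, p:p + 3, q:q + 3], g)

# Zero the 90% smallest-magnitude Winograd weights (illustrative threshold).
n_zero = int(round(0.9 * U.size))
threshold = np.sort(np.abs(U), axis=None)[n_zero - 1]
U_sparse = np.where(np.abs(U) > threshold, U, 0.0)

Y_sparse, nnz_used = sparse_winograd(U_sparse, tiles)

# Dense reference that uses the same pruned weights.
V_ref = np.einsum('ij,ctjk,lk->ilct', B_T, tiles, B_T)
M_ref = np.einsum('ilkc,ilct->ilkt', U_sparse, V_ref)
Y_ref = np.einsum('pi,ilkt,ql->ktpq', A_T, M_ref, A_T)

print("weights used:", nnz_used, "of", U.size)
```

In this sketch the work in the element-wise stage scales with the number of surviving Winograd weights rather than with the full weight count, which is the property a sparse implementation can exploit. The pruned output matches a dense computation with the same pruned weights, not the original unpruned network.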

Abstract

Sparse methods and the use of Winograd convolutions are two orthogonal approaches, each of which significantly accelerates convolution computations in modern CNNs. Sparse Winograd merges these two and thus has the potential to offer a combined performance benefit. Nevertheless, training convolution layers so that the resulting Winograd kernels are sparse has not hitherto been very successful. By introducing a Winograd layer in place of a standard convolution layer, we can learn and prune Winograd coefficients "natively" and obtain sparsity levels beyond 90% with only 0.1% accuracy loss for AlexNet on the ImageNet dataset. Furthermore, we present a sparse Winograd convolution algorithm and implementation that exploits the sparsity, achieving up to 31.7 effective TFLOP/s in 32-bit precision on a recent Intel Xeon CPU, which corresponds to a 5.4x speedup over a state-of-the-art dense…
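The difficulty the abstract describes can be illustrated with a small NumPy sketch, assuming the common F(2x2, 3x3) tile size: a 3x3 kernel that is sparse in the spatial domain generally becomes dense again after the Winograd kernel transform, whereas pruning the 4x4 Winograd coefficients directly keeps them sparse. The helper names and the one-shot magnitude pruning used here are illustrative assumptions that stand in for the paper's training procedure.

```python
import numpy as np

# Standard F(2x2, 3x3) Winograd transform matrices.
B_T = np.array([[1, 0, -1, 0],
                [0, 1, 1, 0],
                [0, -1, 1, 0],
                [0, 1, 0, -1]], dtype=np.float64)
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
A_T = np.array([[1, 1, 1, 0],
                [0, 1, -1, -1]], dtype=np.float64)


def winograd_tile(d, U):
    """2x2 output from a 4x4 input tile d and 4x4 Winograd-domain kernel U."""
    V = B_T @ d @ B_T.T               # input transform
    return A_T @ (U * V) @ A_T.T      # element-wise product, then output transform


def direct_tile(d, g):
    """Reference: direct 3x3 correlation producing the same 2x2 output."""
    return np.array([[np.sum(d[i:i + 3, j:j + 3] * g) for j in range(2)]
                     for i in range(2)])


def magnitude_prune(w, sparsity):
    """Zero the smallest-magnitude fraction `sparsity` of entries (hypothetical helper)."""
    k = int(round(sparsity * w.size))
    if k == 0:
        return w.copy()
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)


rng = np.random.default_rng(0)
d = rng.standard_normal((4, 4))   # input tile
g = rng.standard_normal((3, 3))   # spatial kernel

# Dense case: Winograd and direct convolution agree.
U = G @ g @ G.T
y_winograd = winograd_tile(d, U)
y_direct = direct_tile(d, g)

# Spatial pruning: 6 of 9 weights zeroed, then transformed.
g_pruned = magnitude_prune(g, 2 / 3)
U_from_pruned_g = G @ g_pruned @ G.T

# Native pruning: 12 of 16 Winograd coefficients zeroed directly.
U_native = magnitude_prune(U, 0.75)

print("nonzeros in pruned spatial kernel:", np.count_nonzero(g_pruned))
print("nonzeros after transforming it:  ", np.count_nonzero(U_from_pruned_g))
print("nonzeros after native pruning:   ", np.count_nonzero(U_native))
```

Because a natively pruned `U_native` generally has no exact 3x3 spatial counterpart, the Winograd coefficients themselves have to be the trainable parameters, which is the role the abstract assigns to the Winograd layer.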

Code & Models

Repositories

IntelLabs/SkimCaffe
tf

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Vision and Imaging · Image Enhancement Techniques

Methods: 1x1 Convolution · Local Response Normalization · Grouped Convolution · Dropout · Dense Connections · Max Pooling · Softmax · Convolution