High Performance Convolution Using Sparsity and Patterns for Inference in Deep Convolutional Neural Networks
Hossam Amer, Ahmed H. Salamah, Ahmad Sajedi, En-hui Yang

TL;DR
This paper introduces two novel convolution algorithms, CPO and CPS, that leverage sparsity in activation maps to significantly reduce memory usage and accelerate inference in deep CNNs without sacrificing accuracy.
Contribution
The paper proposes CPO and CPS algorithms that exploit activation map sparsity for more efficient convolution, outperforming traditional methods in speed and compression.
Findings
Up to 63% per-layer time savings on CPUs.
Compression ratio up to 26x compared to im2col.
Inference time savings up to 9% with 10x compression.
Abstract
Deploying deep Convolutional Neural Networks (CNNs) is impacted by their memory footprint and speed requirements, which mainly come from convolution. Widely-used convolution algorithms, im2col and MEC, produce a lowered matrix from an activation map by redundantly storing the map's elements included at horizontal and/or vertical kernel overlappings without considering the sparsity of the map. Using the sparsity of the map, this paper proposes two new convolution algorithms dubbed Compressed Pattern Overlap (CPO) and Compressed Pattern Sets (CPS) that simultaneously decrease the memory footprint and increase the inference speed while preserving the accuracy. CPO recognizes non-zero elements (NZEs) at horizontal and vertical overlappings in the activation maps. CPS further improves the memory savings of CPO by compressing the index positions of neighboring NZEs. In both algorithms,…
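To make the redundancy described in the abstract concrete, the following is a minimal pure-Python sketch of im2col-style lowering on a small sparse activation map. It is not the paper's CPO or CPS algorithm: the helper names `im2col` and `nonzero_triples`, the 4x4 example map, and the (row, col, value) storage are illustrative assumptions chosen only to show how many zeros a lowered matrix can repeat compared with keeping the non-zero elements alone.

```python
def im2col(x, kh, kw, stride=1):
    """Lower a 2-D activation map (list of rows) into an im2col-style matrix.

    Each row of the result is one flattened kh x kw window, so elements
    covered by overlapping windows are stored more than once.
    """
    h, w = len(x), len(x[0])
    rows = []
    for i in range(0, h - kh + 1, stride):
        for j in range(0, w - kw + 1, stride):
            rows.append([x[i + di][j + dj] for di in range(kh) for dj in range(kw)])
    return rows


def nonzero_triples(x):
    """Keep only non-zero elements as (row, col, value) triples.

    This is a generic coordinate format used for illustration,
    not the index encoding used by CPO or CPS.
    """
    return [(i, j, v) for i, row in enumerate(x) for j, v in enumerate(row) if v != 0]


# Hypothetical sparse 4x4 activation map (e.g. the output of a ReLU layer).
activation = [
    [0, 2, 0, 0],
    [0, 0, 3, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 4],
]

lowered = im2col(activation, kh=3, kw=3)
lowered_size = sum(len(r) for r in lowered)              # entries stored by im2col
lowered_zeros = sum(v == 0 for r in lowered for v in r)  # how many of them are zero
nze = nonzero_triples(activation)                        # non-zero elements only

print(lowered_size, lowered_zeros, len(nze))
```

For this toy map with a 3x3 kernel and stride 1, the lowered matrix holds 36 entries, 27 of which are zeros, while the map itself contains only 4 non-zero elements. The gap between those two counts is the kind of redundancy that a sparsity-aware representation can avoid storing.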
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Sparse and Compressive Sensing Techniques
Methods: Convolution
