FSCNN: A Fast Sparse Convolution Neural Network Inference System
Bo Ji, Tianyi Chen

TL;DR
FSCNN is an efficient inference system that leverages fine-grained sparsity in compressed CNNs to accelerate forward passes, outperforming standard libraries like PyTorch at high sparsity levels.
Contribution
The paper introduces FSCNN, a specialized system with data structures and algorithms designed to utilize fine-grained sparsity for faster CNN inference.
Findings
FSCNN outperforms PyTorch on VGG16 with high sparsity.
Sparse operators face contiguity issues limiting speedup.
Structured sparsity remains preferable for general model compression.
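To make the findings above more concrete, here is a minimal sketch of a convolution that skips zero weights. It is not FSCNN's actual implementation: the CSR-style layout, the im2col unfolding, and the function names `to_csr`, `im2col`, and `sparse_conv2d` are all assumptions chosen for illustration, and it covers only stride 1 with no padding.

```python
import numpy as np

def to_csr(dense):
    """Convert a 2-D array to CSR arrays: values, column indices, row pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        nz = np.nonzero(row)[0]
        values.extend(row[nz])
        col_idx.extend(nz)
        row_ptr.append(len(values))
    return (np.array(values, dtype=dense.dtype),
            np.array(col_idx, dtype=np.int64),
            np.array(row_ptr, dtype=np.int64))

def im2col(x, kh, kw):
    """Unfold a (C, H, W) input into a (C*kh*kw, out_h*out_w) matrix."""
    c, h, w = x.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = np.empty((c * kh * kw, out_h * out_w), dtype=x.dtype)
    row = 0
    for ci in range(c):
        for i in range(kh):
            for j in range(kw):
                cols[row] = x[ci, i:i + out_h, j:j + out_w].reshape(-1)
                row += 1
    return cols

def sparse_conv2d(x, weight):
    """Convolve x (C, H, W) with weight (K, C, kh, kw), touching only nonzero weights."""
    k, c, kh, kw = weight.shape
    # Hypothetical layout: one CSR row per output filter.
    values, col_idx, row_ptr = to_csr(weight.reshape(k, -1))
    cols = im2col(x, kh, kw)
    out = np.zeros((k, cols.shape[1]), dtype=x.dtype)
    for r in range(k):
        start, end = row_ptr[r], row_ptr[r + 1]
        # Gather only the rows of `cols` that pair with nonzero weights.
        out[r] = values[start:end] @ cols[col_idx[start:end]]
    out_h, out_w = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    return out.reshape(k, out_h, out_w)
```

The gather `cols[col_idx[start:end]]` reads rows scattered across memory, which is one plausible reading of the contiguity issue listed above: the arithmetic shrinks with sparsity, but the memory access pattern becomes irregular.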
Abstract
Convolutional neural networks (CNNs) have achieved remarkable success, but they typically come with high computational cost and many redundant weight parameters. To reduce FLOPs, structured pruning is a popular approach that removes entire hidden structures by introducing coarse-grained sparsity. Meanwhile, many pruning works instead leverage fine-grained sparsity (where the sparsity is randomly distributed), but the resulting sparse models lack a specially designed computing library to realize the potential speedup. In this technical report, we study and present an efficient convolutional neural network inference system that accelerates the forward pass by utilizing the fine-grained sparsity of compressed CNNs. The resulting system, FSCNN, is built on a set of specially designed sparse data structures, operators, and associated algorithms. Experimentally, we validate that FSCNN outperforms standard deep…
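The distinction the abstract draws between coarse-grained and fine-grained sparsity can be sketched with two pruning masks. This is an illustration only: magnitude-based selection, L1 filter norms, and the function names `fine_grained_mask` and `filter_mask` are assumptions, not the pruning criteria used by the paper.

```python
import numpy as np

def fine_grained_mask(weight, sparsity):
    """Fine-grained: drop individual weights with the smallest magnitude."""
    flat = np.abs(weight).ravel()
    n_prune = int(round(sparsity * flat.size))
    mask = np.ones(flat.size, dtype=bool)
    mask[np.argsort(flat, kind="stable")[:n_prune]] = False
    return mask.reshape(weight.shape)

def filter_mask(weight, sparsity):
    """Coarse-grained: drop whole output filters with the smallest L1 norm."""
    k = weight.shape[0]
    norms = np.abs(weight).reshape(k, -1).sum(axis=1)
    n_prune = int(round(sparsity * k))
    keep = np.ones(k, dtype=bool)
    keep[np.argsort(norms, kind="stable")[:n_prune]] = False
    # Broadcast the per-filter decision to every weight in that filter.
    return np.broadcast_to(keep[:, None, None, None], weight.shape).copy()

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 4, 3, 3))   # (filters, channels, kh, kw)
fine = fine_grained_mask(w, 0.5)        # zeros scattered across all filters
coarse = filter_mask(w, 0.5)            # zeros grouped into whole filters
```

Both masks remove the same fraction of weights, but the filter mask leaves a smaller dense tensor that standard libraries can run directly, whereas the fine-grained mask leaves scattered zeros that only pay off with sparse-aware operators.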
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Machine Learning and ELM
Methods: Lib · Pruning · Convolution
