Sparse Periodic Systolic Dataflow for Lowering Latency and Power Dissipation of Convolutional Neural Network Accelerators
Jung Hwan Heo, Arash Fayyazi, Amirhossein Esmaili, Massoud Pedram

TL;DR
This paper presents the SPS dataflow, which exploits periodic pattern-based sparsity to improve the energy efficiency of CNN accelerators and cut their weight and indexing storage without accuracy loss.
Contribution
The paper introduces the SPS dataflow and a sparsity-aware compiler that exploit regular sparsity patterns for efficient CNN acceleration on FPGA.
Findings
4.49x energy efficiency improvement
3.67x reduction in total weight storage
22,044x reduction in indexing memory
Abstract
This paper introduces the sparse periodic systolic (SPS) dataflow, which advances the state-of-the-art hardware accelerator for supporting lightweight neural networks. Specifically, the SPS dataflow enables a novel hardware design approach unlocked by an emergent pruning scheme, periodic pattern-based sparsity (PPS). By exploiting the regularity of PPS, our sparsity-aware compiler optimally reorders the weights and uses a simple indexing unit in hardware to create matches between the weights and activations. Through the compiler-hardware codesign, SPS dataflow enjoys higher degrees of parallelism while being free of the high indexing overhead and without model accuracy loss. Evaluated on popular benchmarks such as VGG and ResNet, the SPS dataflow and accompanying neural network compiler outperform prior work in convolutional neural network (CNN) accelerator designs targeting FPGA…
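To illustrate why a regular, periodic sparsity pattern can remove most of the indexing overhead, here is a minimal sketch. It is not the paper's compiler or hardware. The pattern set `PATTERNS`, the choice of a period along the kernel index, and the helper names `prune_and_pack`, `sparse_dot`, and `index_entries` are all assumptions made for illustration.

```python
# Illustrative sketch only: the pattern set, the period, and the storage layout
# are assumptions, not the configuration used in the paper.

# Hypothetical pattern set. Each pattern lists the kept positions (0..8) of a
# flattened 3x3 kernel. The period equals the number of patterns.
PATTERNS = [
    (0, 1, 3, 4),
    (1, 2, 4, 5),
    (3, 4, 6, 7),
    (4, 5, 7, 8),
]


def prune_and_pack(kernels):
    """Keep only the weights selected by pattern (i mod period) for kernel i."""
    period = len(PATTERNS)
    packed = []
    for i, kernel in enumerate(kernels):
        pattern = PATTERNS[i % period]
        packed.append([kernel[pos] for pos in pattern])
    return packed


def sparse_dot(packed_kernel, kernel_index, activations):
    """Dot product of one packed kernel with a flattened 3x3 activation window.

    The activation positions are recovered from the kernel index alone, so no
    per-weight index has to be stored next to the weights.
    """
    pattern = PATTERNS[kernel_index % len(PATTERNS)]
    return sum(w * activations[pos] for w, pos in zip(packed_kernel, pattern))


def index_entries(num_kernels):
    """Index entries needed: per-weight indexing vs. one shared periodic table."""
    kept = len(PATTERNS[0])
    per_weight = num_kernels * kept      # one index per surviving weight
    periodic = len(PATTERNS) * kept      # one small table shared by all kernels
    return per_weight, periodic


# Eight toy kernels with easily recognizable values.
kernels = [[float(k * 9 + p) for p in range(9)] for k in range(8)]
packed = prune_and_pack(kernels)
window = [1.0] * 9
```

In this toy setup the shared table stays the same size as the number of kernels grows, while per-weight indexing grows linearly. That is the intuition behind the reported indexing savings, although the figures in the paper come from its own pattern scheme and benchmarks, not from this sketch.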
Taxonomy
Methods: Pruning · Average Pooling · Batch Normalization · Convolution · Softmax · Residual Connection · Global Average Pooling · 1x1 Convolution · Dense Connections
