Sense: Model Hardware Co-design for Accelerating Sparse CNN on Systolic Array
Wenhao Sun, Deng Liu, Zhiwei Zou, Wendi Sun, Yi Kang, Song Chen

TL;DR
Sense is a systolic-array-based architecture for sparse CNN acceleration that uses model-hardware co-design, including channel clustering, load-balancing pruning, and adaptive dataflow, to significantly improve performance and energy efficiency.
Contribution
It introduces a co-designed hardware architecture with novel load balancing and dataflow strategies tailored for sparse CNNs on systolic arrays.
Findings
Achieves up to 2.25x performance improvement over existing accelerators.
Reduces DRAM access by up to 1.8x, lowering energy consumption.
Performs efficiently on multiple CNN models, such as AlexNet and ResNet-50.
Abstract
Sparsity is an intrinsic property of convolutional neural networks (CNNs) and is worth exploiting in CNN accelerators, but the extra processing it requires adds hardware overhead, leaving many architectures with only minor gains. Meanwhile, systolic arrays have become increasingly competitive for CNN acceleration because of their high spatiotemporal locality and low hardware overhead. However, the irregularity of sparsity induces imbalanced workloads under the rigid systolic dataflow, causing performance degradation. This paper therefore proposes a systolic-array-based architecture, called Sense, for sparse CNN acceleration through model-hardware co-design, achieving large performance improvements. To balance input feature map (IFM) and weight loads across the processing element (PE) array, we apply channel clustering to gather IFMs with similar sparsity for array computation, and co-design a load-balancing…
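The idea of gathering IFM channels with similar sparsity can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a simple sort-by-sparsity grouping, and every name in it (`channel_sparsity`, `cluster_channels`, `group_cost`) is hypothetical. The cost proxy assumes that channels processed together finish only when the densest one does, so each group costs its maximum nonzero count.

```python
import numpy as np

def channel_sparsity(ifm):
    """Fraction of zero entries per channel; ifm has shape (C, H, W)."""
    c = ifm.shape[0]
    return (ifm.reshape(c, -1) == 0).mean(axis=1)

def cluster_channels(ifm, group_size):
    """Assumed heuristic: sort channels by sparsity, then cut the sorted
    order into consecutive groups of at most group_size channels."""
    order = np.argsort(channel_sparsity(ifm), kind="stable")
    return [order[i:i + group_size] for i in range(0, len(order), group_size)]

def group_cost(ifm, groups):
    """Proxy for cycles: each group costs as much as its densest channel."""
    nnz = np.count_nonzero(ifm.reshape(ifm.shape[0], -1), axis=1)
    return int(sum(nnz[g].max() for g in groups))

# Toy IFM with 4 channels of 4 values each: channels 0 and 2 are dense,
# channels 1 and 3 hold a single nonzero.
ifm = np.zeros((4, 1, 4))
ifm[0] = 1.0
ifm[2] = 1.0
ifm[1, 0, 0] = 1.0
ifm[3, 0, 0] = 1.0

naive = [np.array([0, 1]), np.array([2, 3])]   # channels in original order
clustered = cluster_channels(ifm, group_size=2)

print(group_cost(ifm, naive))      # 8: every group waits on a dense channel
print(group_cost(ifm, clustered))  # 5: dense and sparse channels are separated
```

In this toy case, grouping channels in their original order pairs each sparse channel with a dense one, so the sparse channel's lane sits idle. Grouping by sparsity lets the sparse pair finish early, which is the kind of imbalance the abstract attributes to irregular sparsity under a rigid systolic dataflow.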
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Brain Tumor Detection and Classification
