Systolic Sparse Tensor Slices: FPGA Building Blocks for Sparse and Dense AI Acceleration
Endri Taka, Ning-Chi Huang, Chi-Chih Chang, Kai-Chiang Wu, Aman Arora, Diana Marculescu

TL;DR
This paper introduces systolic sparse tensor (SST) slices: in-fabric FPGA blocks that support dense operation and structured sparsity at 2:4, 1:4, and 1:3 levels, accelerating a wide range of DNNs with higher frequency, lower area, and faster sparse inference than traditional FPGA designs.
Contribution
It proposes 2D systolic sparse tensor slices supporting multiple sparsity levels, enhancing FPGA-based DNN acceleration for both dense and sparse models.
Findings
Up to 5x higher FPGA frequency with SSTs.
Up to 10.9x lower area compared to traditional FPGA designs.
Up to 3.52x speedup on sparse DNN models.
Abstract
FPGA architectures have recently been enhanced to meet the substantial computational demands of modern deep neural networks (DNNs). To this end, both FPGA vendors and academic researchers have proposed in-fabric blocks that perform efficient tensor computations. However, these blocks are primarily optimized for dense computation, while most DNNs exhibit sparsity. To address this limitation, we propose incorporating structured sparsity support into FPGA architectures. We architect 2D systolic in-fabric blocks, named systolic sparse tensor (SST) slices, that support multiple degrees of sparsity to efficiently accelerate a wide variety of DNNs. SSTs support dense operation, 2:4 (50%) and 1:4 (75%) sparsity, as well as a new 1:3 (66.7%) sparsity level to further increase flexibility. When demonstrating on general matrix multiplication (GEMM) accelerators, which are the heart of most current…
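The N:M structured sparsity levels named in the abstract (keep at most N nonzero weights in every group of M consecutive weights) can be illustrated in software. The following is a minimal sketch, not the paper's hardware design: the function names `prune_n_m`, `compress_n_m`, `sparse_dot`, and `sparsity` are hypothetical, and magnitude-based selection is an assumed pruning criterion chosen only for illustration.

```python
from typing import List, Tuple


def sparsity(n: int, m: int) -> float:
    """Fraction of weights that are zero under an N:M pattern."""
    return 1 - n / m


def _kept_positions(group: List[float], n: int) -> List[int]:
    # Positions of the n largest-magnitude entries, returned in ascending order.
    # sorted() is stable, so ties resolve to the lowest position first.
    by_magnitude = sorted(range(len(group)), key=lambda i: -abs(group[i]))
    return sorted(by_magnitude[:n])


def prune_n_m(row: List[float], n: int, m: int) -> List[float]:
    """Zero all but the n largest-magnitude weights in each group of m (assumed criterion)."""
    if len(row) % m != 0:
        raise ValueError("row length must be a multiple of m")
    pruned: List[float] = []
    for start in range(0, len(row), m):
        group = row[start:start + m]
        keep = set(_kept_positions(group, n))
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


def compress_n_m(row: List[float], n: int, m: int) -> Tuple[List[float], List[int]]:
    """Store only the kept values plus each value's position inside its group."""
    if len(row) % m != 0:
        raise ValueError("row length must be a multiple of m")
    values: List[float] = []
    indices: List[int] = []
    for start in range(0, len(row), m):
        group = row[start:start + m]
        for i in _kept_positions(group, n):
            values.append(group[i])
            indices.append(i)
    return values, indices


def sparse_dot(values: List[float], indices: List[int],
               activations: List[float], n: int, m: int) -> float:
    """Dot product that multiplies only the stored weights with their matching activations."""
    total = 0.0
    for k, (v, idx) in enumerate(zip(values, indices)):
        group = k // n  # every group contributes exactly n stored values
        total += v * activations[group * m + idx]
    return total


weights = [0.1, -0.9, 0.5, 0.2, 0.0, 0.3, -0.4, 0.05]
activations = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
values, indices = compress_n_m(weights, 2, 4)
result = sparse_dot(values, indices, activations, 2, 4)
```

In this sketch a 2:4 pattern performs half the multiplications of the dense dot product, which matches the 50% figure in the abstract; `sparsity(1, 4)` and `sparsity(1, 3)` give 75% and roughly 66.7% in the same way.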
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Tensor decomposition and applications
