Extending Sparse Tensor Accelerators to Support Multiple Compression Formats
Eric Qin, Geonhwa Jeong, William Won, Sheng-Chun Kao, Hyoukjun Kwon, Sudarshan Srinivasan, Dipankar Das, Gordon E. Moon, Sivasankaran Rajamanickam, Tushar Krishna

TL;DR
This paper introduces hardware extensions that let sparse tensor accelerators support many compression format combinations, so they can process diverse sparse data efficiently, with a roughly 4x speedup over performing format conversions in software.
Contribution
It presents hardware extensions that allow accelerators to handle numerous compression format combinations seamlessly, improving memory and compute efficiency across different sparsity levels and tensor dimensions.
Findings
Achieves approximately 4x speedup over software format conversions
Supports multiple compression formats in hardware for diverse workloads
Enhances accelerator flexibility and efficiency in sparse tensor processing
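
For readers unfamiliar with what a software format conversion involves, the following is a minimal sketch of one common case: converting a matrix from coordinate (COO) form to compressed sparse row (CSR) form. The choice of COO and CSR, and the function name `coo_to_csr`, are illustrative assumptions; the summary does not state which conversions the paper benchmarks, and this is not the authors' implementation.

```python
def coo_to_csr(num_rows, rows, cols, vals):
    """Convert COO triples to CSR arrays (hypothetical helper for illustration).

    rows, cols, vals are parallel lists with one entry per nonzero.
    Returns (row_ptr, col_idx, csr_vals).
    """
    # Pass 1: count the nonzeros in each row.
    row_ptr = [0] * (num_rows + 1)
    for r in rows:
        row_ptr[r + 1] += 1

    # Prefix sum turns per-row counts into row start offsets.
    for i in range(num_rows):
        row_ptr[i + 1] += row_ptr[i]

    # Pass 2: scatter each nonzero into the next free slot of its row.
    next_slot = row_ptr[:-1]  # slicing copies, so row_ptr is not modified
    col_idx = [0] * len(vals)
    csr_vals = [0] * len(vals)
    for r, c, v in zip(rows, cols, vals):
        dst = next_slot[r]
        col_idx[dst] = c
        csr_vals[dst] = v
        next_slot[r] += 1

    return row_ptr, col_idx, csr_vals
```

Even this simple conversion needs two passes over the nonzeros plus a prefix sum, which gives a sense of the overhead that hardware support for multiple formats is meant to avoid.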
Abstract
Sparsity, which occurs in both scientific applications and Deep Learning (DL) models, has been a key target of optimization within recent ASIC accelerators due to the potential memory and compute savings. These applications use data stored in a variety of compression formats. We demonstrate that both the compactness of different compression formats and the compute efficiency of the algorithms enabled by them vary across tensor dimensions and amount of sparsity. Since DL and scientific workloads span across all sparsity regions, there can be numerous format combinations for optimizing memory and compute efficiency. Unfortunately, many proposed accelerators operate on one or two fixed format combinations. This work proposes hardware extensions to accelerators for supporting numerous format combinations seamlessly and demonstrates ~4X speedup over performing format conversions in software.
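
The claim that format compactness varies with tensor dimensions and sparsity can be illustrated with a rough storage count. The sketch below compares a dense layout with COO and CSR for a 2D matrix, under the simplifying assumption that every value, index, and pointer occupies one word. The function names and the three formats chosen are illustrative assumptions, not the set of formats evaluated in the paper.

```python
def storage_words(rows, cols, nnz):
    """Words needed per format, assuming one word per value, index, or pointer."""
    return {
        "dense": rows * cols,        # every element stored, zero or not
        "coo": 3 * nnz,              # row index, column index, value per nonzero
        "csr": 2 * nnz + rows + 1,   # column index and value per nonzero, plus row pointers
    }


def most_compact(rows, cols, nnz):
    """Name of the format with the smallest footprint under the model above."""
    sizes = storage_words(rows, cols, nnz)
    return min(sizes, key=sizes.get)


if __name__ == "__main__":
    # Same 1000 x 1000 shape at three different nonzero counts.
    for nnz in (10, 100_000, 500_000):
        print(nnz, storage_words(1000, 1000, nnz), most_compact(1000, 1000, nnz))
```

Under this simple model, COO is smallest when the matrix is extremely sparse, CSR takes over once the row-pointer array is amortized over enough nonzeros, and the dense layout is smallest at around 50% density, which matches the abstract's point that no single format is best everywhere.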
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
