SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning
Zihao Ye, Ruihang Lai, Junru Shao, Tianqi Chen, Luis Ceze

TL;DR
SparseTIR introduces a composable abstraction for sparse tensor compilation in deep learning, enabling flexible formats and transformations that significantly improve performance across various operators and end-to-end workloads.
Contribution
The paper proposes SparseTIR, a sparse tensor compilation framework with composable formats and composable transformations, addressing the limitations of single-format and single-shot compilers.
Findings
Achieves 1.20-2.34x speedup on GPU for GNN operators
Attains 1.05-2.98x speedup for sparse attention operators
Provides 4.20-40.18x acceleration for RGCN inference
Abstract
Sparse tensors are rapidly becoming critical components of modern deep learning workloads. However, developing high-performance sparse operators can be difficult and tedious, and existing vendor libraries cannot satisfy the escalating demands from new operators. Sparse tensor compilers simplify the development of operators, but efficient sparse compilation for deep learning remains challenging because a single sparse format cannot maximize hardware efficiency, and single-shot compilers cannot keep up with the latest hardware and system advances. In this paper, we observe that the key to addressing both of these challenges is to leverage composable formats and composable transformations. We propose SparseTIR, a sparse tensor compilation abstraction that offers composable formats and composable transformations for deep learning workloads. SparseTIR constructs a search space over these composable…
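To make the idea of composable formats concrete, here is a minimal sketch in plain Python. It is not SparseTIR's API; every function name and the `width` parameter are hypothetical and chosen only for illustration. It assumes one simple decomposition: rows with few nonzeros go into a padded, fixed-width ELL part (regular and easy to vectorize), while the longer rows stay in a CSR remainder. A sparse matrix-vector product is then computed by running a separate loop over each part.

```python
# Illustrative sketch only: NOT SparseTIR's API. All names are hypothetical.

def dense_to_csr(dense):
    """Convert a dense row-major matrix (list of lists) to CSR arrays."""
    indptr, indices, data = [0], [], []
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                indices.append(j)
                data.append(v)
        indptr.append(len(indices))
    return indptr, indices, data


def decompose(indptr, indices, data, width):
    """Split a CSR matrix into two formats.

    Rows with at most `width` nonzeros go to an ELL part, padded to exactly
    `width` slots per row. Longer rows go to a CSR remainder.
    """
    ell_rows, ell_cols, ell_vals = [], [], []
    rem_rows, rem_indptr, rem_indices, rem_data = [], [0], [], []
    for i in range(len(indptr) - 1):
        lo, hi = indptr[i], indptr[i + 1]
        nnz = hi - lo
        if nnz <= width:
            pad = width - nnz
            ell_rows.append(i)
            ell_cols.append(indices[lo:hi] + [0] * pad)  # padded column ids
            ell_vals.append(data[lo:hi] + [0.0] * pad)   # padded zeros add nothing
        else:
            rem_rows.append(i)
            rem_indices.extend(indices[lo:hi])
            rem_data.extend(data[lo:hi])
            rem_indptr.append(len(rem_indices))
    ell = (ell_rows, ell_cols, ell_vals)
    rem = (rem_rows, rem_indptr, rem_indices, rem_data)
    return ell, rem


def spmv_composed(ell, rem, x, n_rows):
    """Compute y = A @ x by running one loop per format and merging results."""
    y = [0.0] * n_rows
    ell_rows, ell_cols, ell_vals = ell
    for r, cols, vals in zip(ell_rows, ell_cols, ell_vals):
        # Fixed trip count per row: the regular part of the workload.
        y[r] = sum(v * x[c] for c, v in zip(cols, vals))
    rem_rows, rem_indptr, rem_indices, rem_data = rem
    for k, r in enumerate(rem_rows):
        # Variable trip count per row: the irregular part of the workload.
        lo, hi = rem_indptr[k], rem_indptr[k + 1]
        y[r] = sum(rem_data[p] * x[rem_indices[p]] for p in range(lo, hi))
    return y


dense = [
    [1, 0, 0, 2],
    [0, 0, 0, 0],
    [3, 4, 5, 6],
    [0, 7, 0, 0],
]
x = [1, 2, 3, 4]
ell, rem = decompose(*dense_to_csr(dense), width=2)
y = spmv_composed(ell, rem, x, n_rows=len(dense))
```

In this toy example, the choice of `width` trades padding overhead in the ELL part against irregularity in the remainder. A search over composable formats, as the abstract describes, would explore choices of this kind, though the sketch does not attempt to reproduce how SparseTIR does so.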
Taxonomy
Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Tensor decomposition and applications
Methods: Attention Is All You Need · Convolution · Linear Layer · Attention Dropout · Layer Normalization · Adam · Cosine Annealing · Linear Warmup With Cosine Annealing · Weight Decay · Softmax
