Pushing Tensor Accelerators Beyond MatMul in a User-Schedulable Language
Yihong Zhang, Derek Gerstmann, Andrew Adams, Maaz Bin Safeer Ahmad

TL;DR
This paper presents a compiler-based approach using Halide to program tensor accelerators beyond traditional matrix multiplication, enabling diverse applications like image processing to achieve significant speedups.
Contribution
It introduces a flexible tensor instruction selector based on equality saturation within Halide, allowing easy programming of tensor accelerators for various workloads.
Findings
Achieved up to 6.1x speedup on image processing pipelines
Demonstrated the potential of tensor accelerators beyond ML tasks
Enabled concise application development with a few dozen lines of code
Abstract
Tensor accelerators now represent a growing share of compute resources in modern CPUs and GPUs. However, they are hard to program, leading developers to use vendor-provided kernel libraries that support tensor accelerators. As a result, the usage of tensor accelerators is limited to the provided interface, mainly designed for traditional ML and scientific computing workloads. In this paper, we show that tensor accelerators can improve the performance of applications beyond simple variants of MatMul. For example, many image processing pipelines are linear transformations over matrices in disguise and can therefore utilize such specialized hardware. This is nonetheless hindered by the difficulties in programming tensor accelerators. We tackle this problem with compiler-based techniques. We use the Halide user-schedulable language and express operations as Halide algorithms succinctly.…
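The abstract's claim that image processing pipelines are often linear transformations "in disguise" can be illustrated with a small sketch. The example below is not the paper's Halide implementation or its instruction selector. It is a plain-Python illustration, with hypothetical function names, of one well-known identity: a sliding-window filter applied to every row of an image equals a single matrix product with a banded matrix built from the filter taps, which is the shape of computation a MatMul unit accepts.

```python
# Illustrative sketch only: all names here are hypothetical and not from the paper.

def filter_direct(signal, kernel):
    """Sliding-window filter: out[i] = sum_j kernel[j] * signal[i + j]."""
    n, k = len(signal), len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(n - k + 1)]

def banded_matrix(kernel, n):
    """Build the (n - k + 1) x n matrix whose rows are shifted copies of the kernel."""
    k = len(kernel)
    rows = []
    for i in range(n - k + 1):
        row = [0] * n
        row[i:i + k] = kernel  # place the taps at offset i
        rows.append(row)
    return rows

def matmul(a, b):
    """Plain matrix product of a (m x p) and b (p x q)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[p] * b[p][c] for p in range(inner)) for c in range(cols)]
            for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

# A tiny 2 x 5 "image" and a 3-tap horizontal blur (unnormalized).
image = [[1, 2, 3, 4, 5],
         [0, 10, 0, 10, 0]]
kernel = [1, 2, 1]

# Filter each row with the sliding window.
direct = [filter_direct(row, kernel) for row in image]

# The same result as one matrix product: band (3 x 5) times image^T (5 x 2).
band = banded_matrix(kernel, len(image[0]))
via_matmul = transpose(matmul(band, transpose(image)))

print(direct)      # [[8, 12, 16], [20, 20, 20]]
print(via_matmul)  # same values, computed as a MatMul
```

The banded matrix is mostly zeros, so this formulation does redundant arithmetic. It is worthwhile only when the hardware's matrix throughput outweighs that waste, which is the kind of trade-off a compiler-driven approach has to manage.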
Taxonomy
Topics: Tensor decomposition and applications · Parallel Computing and Optimization Techniques · Graph Theory and Algorithms
