Matching Linear Algebra and Tensor Code to Specialized Hardware Accelerators
Pablo Antonio Martínez, Jackson Woodruff, Jordi Armengol-Estapé, Gregorio Bernabé, José Manuel García, and Michael F. P. O'Boyle

TL;DR
This paper introduces ATC, a compiler that uses program synthesis, supported by program classification and dynamic analysis, to automatically map linear algebra code to the APIs of specialized tensor hardware, yielding large performance gains.
Contribution
The paper presents ATC, a compiler that automates the mapping of code to tensor accelerators using synthesis, classification, and analysis, overcoming the fragility of pattern-matching methods.
Findings
Accelerates 2.6x to 7x more programs than previous methods
Achieves over an order of magnitude performance improvement
Successfully applied to real-world tensor and linear algebra codes
Abstract
Dedicated tensor accelerators demonstrate the importance of linear algebra in modern applications. Such accelerators have the potential for impressive performance gains, but require programmers to rewrite code using vendor APIs - a barrier to wider scale adoption. Recent work overcomes this by matching and replacing patterns within code, but such approaches are fragile and fail to cope with the diversity of real-world codes. We develop ATC, a compiler that uses program synthesis to map regions of code to specific APIs. The mapping space that ATC explores is combinatorially large, requiring the development of program classification, dynamic analysis, variable constraint generation and lexical distance matching techniques to make it tractable. We apply ATC to real-world tensor and linear algebra codes and evaluate them against four state-of-the-art approaches. We accelerate between…
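To make the mapping problem concrete, here is a minimal Python sketch of why the search space grows combinatorially and how running code on generated inputs can prune it. It is an illustration under stated assumptions, not ATC's implementation: `user_gemm`, `accel_gemm`, `behaves_the_same`, and `find_bindings` are hypothetical names, and `accel_gemm` is a pure-Python stand-in for a vendor API rather than a real library call.

```python
import itertools
import random


def user_gemm(a, b, c, m, n, k):
    """Hand-written loop nest over flat row-major arrays: c = a (m x k) * b (k x n)."""
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i * k + p] * b[p * n + j]
            c[i * n + j] = acc


def accel_gemm(rows, cols, inner, lhs, rhs, out):
    """Hypothetical stand-in for a vendor GEMM entry point (not a real API)."""
    for i, j in itertools.product(range(rows), range(cols)):
        out[i * cols + j] = sum(
            lhs[i * inner + p] * rhs[p * cols + j] for p in range(inner)
        )


def behaves_the_same(binding, shapes=((2, 3, 4), (3, 5, 2)), seed=0):
    """Compare user code and the candidate API call on generated inputs.

    `binding` maps each API parameter name to a user variable name.
    Shapes use distinct dimensions so that swapped arguments are exposed.
    """
    rng = random.Random(seed)
    for m, n, k in shapes:
        env = {
            "m": m, "n": n, "k": k,
            "a": [rng.uniform(-1.0, 1.0) for _ in range(m * k)],
            "b": [rng.uniform(-1.0, 1.0) for _ in range(k * n)],
        }
        expected = [0.0] * (m * n)
        user_gemm(env["a"], env["b"], expected, m, n, k)
        got = [0.0] * (m * n)
        try:
            accel_gemm(env[binding["rows"]], env[binding["cols"]],
                       env[binding["inner"]], env[binding["lhs"]],
                       env[binding["rhs"]], got)
        except IndexError:
            return False  # out-of-bounds access: this binding cannot be right
        if any(abs(x - y) > 1e-9 for x, y in zip(expected, got)):
            return False
    return True


def find_bindings():
    """Enumerate every assignment of user variables to API parameters."""
    found = []
    for dims in itertools.permutations(["m", "n", "k"]):
        for arrays in itertools.permutations(["a", "b"]):
            binding = dict(zip(["rows", "cols", "inner"], dims))
            binding.update(zip(["lhs", "rhs"], arrays))
            if behaves_the_same(binding):
                found.append(binding)
    return found


if __name__ == "__main__":
    print(find_bindings())
```

Even this toy case has 12 candidate bindings for a five-parameter call, and the input-output check should leave only the intended one. Real vendor APIs take many more parameters (leading dimensions, transpose flags, scaling factors), which is presumably why exhaustive enumeration alone does not scale and extra pruning techniques are needed.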
