Developing a BLAS library for the AMD AI Engine
Tristan Laan, Tiziano De Matteis

TL;DR
This paper introduces AIEBLAS, an open-source BLAS library for the AMD AI Engine, intended to make spatial architectures easier to use for scientific computing beyond machine learning.
Contribution
The paper presents a reusable, customizable BLAS implementation for the AMD AI Engine whose routines can be composed into dataflow programs, with the aim of making spatial architectures accessible to scientific applications outside ML.
Findings
Open-source implementation of BLAS for AMD AI Engine
Designed for easy reuse and customization
Supports scientific computing beyond ML workloads
Abstract
Spatial (dataflow) computer architectures can mitigate the control and performance overhead of classical von Neumann architectures such as traditional CPUs. Driven by the popularity of Machine Learning (ML) workloads, spatial devices are being marketed as ML inference accelerators. Despite providing a rich software ecosystem for ML practitioners, their adoption in other scientific domains is hindered by the steep learning curve and lack of reusable software, which makes them inaccessible to non-experts. We present our ongoing project AIEBLAS, an open-source, expandable implementation of Basic Linear Algebra Subprograms (BLAS) for the AMD AI Engine. Numerical routines are designed to be easy to reuse, customize, and compose in dataflow programs, leveraging the characteristics of the targeted device without requiring the user to deeply understand the underlying hardware and programming…
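To make the idea of composing numerical routines in a dataflow program concrete, here is a minimal sketch in plain Python. It is not the AIEBLAS API and does not target the AI Engine; the function names (`axpy`, `dot`, `axpy_then_dot`) are hypothetical and chosen only to mirror two standard level-1 BLAS operations. Generators stand in for hardware streams, so each element produced by one routine is consumed immediately by the next.

```python
from typing import Iterable, Iterator


def axpy(alpha: float, x: Iterable[float], y: Iterable[float]) -> Iterator[float]:
    """Stream alpha * x[i] + y[i] one element at a time (level-1 BLAS axpy)."""
    for xi, yi in zip(x, y):
        yield alpha * xi + yi


def dot(x: Iterable[float], y: Iterable[float]) -> float:
    """Consume two streams and return their inner product (level-1 BLAS dot)."""
    acc = 0.0
    for xi, yi in zip(x, y):
        acc += xi * yi
    return acc


def axpy_then_dot(alpha: float, x: Iterable[float], y: Iterable[float],
                  z: Iterable[float]) -> float:
    """Hypothetical composition: dot(alpha * x + y, z).

    The output stream of axpy feeds dot directly, so the intermediate
    vector is never stored in full.
    """
    return dot(axpy(alpha, x, y), z)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    y = [4.0, 5.0, 6.0]
    z = [1.0, 1.0, 1.0]
    print(axpy_then_dot(2.0, x, y, z))  # prints 27.0
```

The point of the sketch is the composition pattern: chaining routines through streams avoids writing the intermediate result back to memory, which is the kind of benefit a dataflow device can exploit.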
Taxonomy
Topics: Parallel Computing and Optimization Techniques
