A High-Level Compiler Integration Approach for Deep Learning Accelerators Supporting Abstraction and Optimization

Samira Ahmadifarsani; Daniel Mueller-Gritschneder; Ulf Schlichtmann

arXiv:2507.04828 · cs.LG · July 8, 2025

TL;DR

This paper presents a high-level, TVM-based compiler integration method for GEMM-based deep learning accelerators. It simplifies accelerator integration and automates tensor scheduling, achieving performance on Gemmini comparable to manual toolchains.

Contribution

The paper introduces a TVM-based approach that abstracts away the complexity of compiler integration and incorporates automated tensor scheduling for GEMM-based accelerators.

Findings

01  Seamless integration of accelerators without deep compiler knowledge

02  Automated tensor scheduling with design space exploration

03  Performance comparable to manual toolchains on Gemmini

Abstract

The growing adoption of domain-specific architectures in edge computing platforms for deep learning has highlighted the efficiency of hardware accelerators. However, integrating custom accelerators into modern machine learning (ML) compilers remains a complex challenge due to the need for significant modifications in compilation layers and specialized scheduling techniques. Existing frameworks offer partial solutions and require users to navigate intricate compiler internals. In this paper, we introduce a TVM-based compilation integration approach that targets GEMM-based deep learning accelerators. Our approach abstracts the complexities of compiler integration, enabling seamless integration of accelerators without requiring in-depth knowledge of the underlying compiler. Furthermore, we extend and incorporate design space exploration tools, specifically CoSA, to automate efficient…
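
To make the idea of automated tensor scheduling through design space exploration more concrete, the following is a minimal sketch, not the paper's method and not CoSA's algorithm. It assumes a single scratchpad of fixed capacity and a tiled GEMM C[M,N] += A[M,K] x B[K,N] in which the C tile stays resident across the K loop, then exhaustively searches tile sizes for the lowest estimated memory traffic. The function names (`divisors`, `traffic`, `best_tiling`), the cost model, and the capacity parameter are all hypothetical and chosen only for illustration.

```python
from itertools import product


def divisors(n):
    """Return all positive divisors of n (candidate tile sizes)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def traffic(M, N, K, tm, tn, tk):
    """Estimate words moved between main memory and the scratchpad.

    Assumed (hypothetical) cost model: every (m, n, k) tile iteration
    loads one A tile and one B tile; each C element is written once.
    """
    tiles = (M // tm) * (N // tn) * (K // tk)
    a_loads = tiles * tm * tk
    b_loads = tiles * tk * tn
    c_stores = M * N
    return a_loads + b_loads + c_stores


def best_tiling(M, N, K, capacity_words):
    """Exhaustively search tile sizes whose footprint fits the scratchpad.

    Returns (cost, (tm, tn, tk)) for the cheapest tiling, or None if no
    tiling fits. Ties are broken by the smaller tile tuple.
    """
    best = None
    for tm, tn, tk in product(divisors(M), divisors(N), divisors(K)):
        footprint = tm * tk + tk * tn + tm * tn  # A, B, and C tiles
        if footprint > capacity_words:
            continue
        candidate = (traffic(M, N, K, tm, tn, tk), (tm, tn, tk))
        if best is None or candidate < best:
            best = candidate
    return best


if __name__ == "__main__":
    # Hypothetical problem size and scratchpad capacity (in words).
    print(best_tiling(64, 64, 64, capacity_words=2048))
```

Brute-force enumeration is practical only for small search spaces like this one; it is used here because it keeps the constraint (tiles must fit on chip) and the objective (minimize data movement) easy to see.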
