Autocomp: A Powerful and Portable Code Optimizer for Tensor Accelerators

Charles Hong; Sahil Bhatia; Alvin Cheung; Yakun Sophia Shao

arXiv:2505.18574·cs.PL·November 7, 2025

TL;DR

Autocomp is an LLM-driven approach that automates code optimization for tensor accelerators, using domain knowledge and hardware feedback to improve performance across multiple hardware platforms.

Contribution

The paper introduces Autocomp, a method that combines structured prompts, domain knowledge, and hardware feedback to automate tensor accelerator code optimization. It outperforms vendor libraries, expert hand-tuned code, and ML-based cost models.

Findings

01. Autocomp-optimized code runs 5.6x faster than vendor libraries.

02. Autocomp-optimized code outperforms expert hand-tuned code by 1.9x.

03. Autocomp-optimized code achieves 3.8x higher performance than ML-based cost models.

Abstract

Hardware accelerators, especially those designed for tensor processing, have become ubiquitous in today's computing landscape. However, even with significant efforts in building compilers, programming these tensor accelerators remains challenging, leaving much of their potential underutilized. Recently, large language models (LLMs), trained on large amounts of code, have shown significant promise in code generation and optimization tasks, but generating code in low-resource languages, such as specialized tensor accelerator code, still poses a significant challenge. We tackle this challenge with Autocomp, an approach that empowers accelerator programmers to leverage domain knowledge and hardware feedback to optimize code via an automated LLM-driven search. We accomplish this by: 1) formulating each optimization pass as a structured two-phase prompt, divided into planning and code generation…
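The two-phase structure described in the abstract (planning, then code generation, with hardware feedback steering the search) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`optimize`, `propose_plans`, `generate_code`, `measure`), the beam-style selection, and the toy stand-ins for the LLM and the hardware. A real setup would replace the toy callbacks with LLM calls and on-device measurements, and would also check candidates for correctness.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Candidate:
    code: str
    latency: float  # lower is better; produced by the hardware-feedback callback


def optimize(
    start_code: str,
    propose_plans: Callable[[str], List[str]],   # hypothetical phase 1: planning
    generate_code: Callable[[str, str], str],    # hypothetical phase 2: code generation
    measure: Callable[[str], float],             # hypothetical hardware feedback
    iterations: int = 3,
    beam_width: int = 2,
) -> Candidate:
    """Search over rewrites, keeping the fastest `beam_width` candidates per round."""
    beam = [Candidate(start_code, measure(start_code))]
    for _ in range(iterations):
        pool = list(beam)  # keep parents so a round can never make things worse
        for cand in beam:
            for plan in propose_plans(cand.code):
                new_code = generate_code(cand.code, plan)
                pool.append(Candidate(new_code, measure(new_code)))
        unique = {c.code: c for c in pool}  # drop duplicate programs
        beam = sorted(unique.values(), key=lambda c: c.latency)[:beam_width]
    return beam[0]


# Toy stand-ins (assumptions): a real system would query an LLM and run on hardware.
def toy_plans(code: str) -> List[str]:
    return ["tile", "unroll"]


def toy_codegen(code: str, plan: str) -> str:
    return code + "+" + plan


def toy_measure(code: str) -> float:
    # Pretend each "tile" helps more than each "unroll".
    return 100.0 / (1 + code.count("tile") + 0.5 * code.count("unroll"))


best = optimize("base", toy_plans, toy_codegen, toy_measure, iterations=2, beam_width=2)
print(best.code, round(best.latency, 2))
```

Keeping the parent candidates in each round's pool means a round of poor suggestions cannot displace a faster program found earlier, which is one simple way to let measured feedback, and not the model's own judgment, decide what survives.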

Taxonomy

Topics: Tensor decomposition and applications · Parallel Computing and Optimization Techniques · Advanced Neural Network Applications