Towards Automated Kernel Generation in the Era of LLMs

Yang Yu; Peiyu Zang; Chi Hsu Tsai; Haiming Wu; Yixin Shen; Jialing Zhang; Haoyu Wang; Zhiyou Xiao; Jingze Shi; Yuyu Luo; Wentao Zhang; Chunlei Men; Guang Liu; Yonghua Lin

arXiv:2601.15727·cs.LG·January 27, 2026

Open Access

TL;DR

This paper surveys recent advances in automating kernel generation using large language models and agentic systems, highlighting current approaches, datasets, benchmarks, challenges, and future directions.

Contribution

It provides a systematic overview of LLM-driven kernel generation methods, compiles key datasets and benchmarks, and outlines open challenges for future research.

Findings

01. LLMs can effectively encode expert kernel knowledge

02. Agentic systems enable iterative, feedback-driven kernel optimization

03. The field is rapidly evolving but lacks a unified framework

Abstract

The performance of modern AI systems is fundamentally constrained by the quality of their underlying kernels, which translate high-level algorithmic semantics into low-level hardware operations. Achieving near-optimal kernels requires expert-level understanding of hardware architectures and programming models, making kernel engineering a critical but notoriously time-consuming and non-scalable process. Recent advances in large language models (LLMs) and LLM-based agents have opened new possibilities for automating kernel generation and optimization. LLMs are well-suited to compress expert-level kernel knowledge that is difficult to formalize, while agentic systems further enable scalable optimization by casting kernel development as an iterative, feedback-driven loop. Rapid progress has been made in this area. However, the field remains fragmented, lacking a systematic perspective for…
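
The "iterative, feedback-driven loop" mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `propose`, `is_correct`, `measure`, and `optimize`, the fixed candidate pool that stands in for LLM output, and the toy sum-of-squares "kernel" are all assumptions made for illustration. A real system would generate GPU or accelerator code and would pass the feedback string into the model prompt.

```python
import time
from typing import Callable, List, Optional, Tuple

Kernel = Callable[[List[float]], float]


def reference(xs: List[float]) -> float:
    """Trusted baseline: sum of squares with an explicit loop."""
    total = 0.0
    for x in xs:
        total += x * x
    return total


# Hypothetical candidate pool standing in for LLM-generated kernels.
def cand_wrong(xs: List[float]) -> float:
    return sum(xs)  # deliberately incorrect: forgets to square


def cand_builtin(xs: List[float]) -> float:
    return sum(x * x for x in xs)


def cand_loop(xs: List[float]) -> float:
    return reference(xs)


CANDIDATES: List[Kernel] = [cand_wrong, cand_builtin, cand_loop]


def propose(feedback: str, round_idx: int) -> Kernel:
    """Stand-in for an LLM call; here `feedback` is ignored (assumption)."""
    return CANDIDATES[round_idx % len(CANDIDATES)]


def is_correct(kernel: Kernel, xs: List[float], tol: float = 1e-6) -> bool:
    """Compare the candidate against the reference with a relative tolerance."""
    expected = reference(xs)
    return abs(kernel(xs) - expected) <= tol * max(1.0, abs(expected))


def measure(kernel: Kernel, xs: List[float], repeats: int = 5) -> float:
    """Best-of-N wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(xs)
        best = min(best, time.perf_counter() - start)
    return best


def optimize(rounds: int, xs: List[float]) -> Tuple[Optional[Kernel], List[str]]:
    """Generate, verify, time, and record feedback for each round."""
    best_kernel: Optional[Kernel] = None
    best_time = float("inf")
    feedback = "no attempts yet"
    history: List[str] = []
    for i in range(rounds):
        kernel = propose(feedback, i)
        if not is_correct(kernel, xs):
            # Incorrect candidates are never timed or kept.
            feedback = f"{kernel.__name__}: wrong output"
        else:
            elapsed = measure(kernel, xs)
            if elapsed < best_time:
                best_kernel, best_time = kernel, elapsed
            feedback = f"{kernel.__name__}: correct, {elapsed:.2e}s"
        history.append(feedback)
    return best_kernel, history
```

Calling `optimize(3, [0.5 * i for i in range(1000)])` runs one round per candidate. The incorrect candidate is rejected before timing, and the fastest correct one is returned together with the feedback history. Checking correctness before measuring speed is a design choice of this sketch, so a fast but wrong kernel can never be selected.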

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Machine Learning and Data Classification · Advanced Neural Network Applications