KernelCraft: Benchmarking for Agentic Close-to-Metal Kernel Generation on Emerging Hardware
Jiayi Nie, Haoran Wu, Yao Lai, Zeyu Cao, Cheng Zhang, Binglei Lou, Erwei Wang, Jianyi Cheng, Timothy M. Jones, Robert Mullins, Rika Antonova, Yiren Zhao

TL;DR
KernelCraft is a benchmark for evaluating LLM agents that generate and optimize low-level kernels for emerging hardware accelerators. It shows that such agents can produce valid, efficient kernels quickly across diverse platforms.
Contribution
This work presents the first benchmark to assess LLM agent capabilities in generating and refining kernels for new ISAs using feedback-driven workflows.
Findings
Agents produce valid kernels within a few refinement steps
Optimized kernels match or outperform compiler baselines
Effective across multiple emerging hardware platforms
Abstract
New AI accelerators with novel instruction set architectures (ISAs) often require developers to manually craft low-level kernels -- a time-consuming, laborious, and error-prone process that cannot scale across diverse hardware targets. This prevents emerging hardware platforms from reaching the market efficiently. While prior LLM-based code generation has shown promise in mature GPU ecosystems, it remains unclear whether agentic LLM systems can quickly produce valid and efficient kernels for emerging hardware with new ISAs. We present KernelCraft: the first benchmark to evaluate an LLM agent's ability to generate and optimize low-level kernels for customized accelerators via a function-calling, feedback-driven workflow. Within KernelCraft, the agent refines kernels under ISA and hardware constraints using automated feedback derived from compilation checks, simulation, and correctness…
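The feedback-driven workflow described in the abstract can be pictured as a loop: the agent proposes a kernel, automated checks run in order, and the first failure is returned to the agent as feedback for the next attempt. The following is a minimal sketch under stated assumptions, not KernelCraft's actual interface. Every name here (`Feedback`, `run_checks`, `refine`, the toy checks, the scripted agent, and the opcode mnemonics) is hypothetical, and the string-matching checks are stand-ins for real compilation, simulation, and correctness tooling.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Feedback:
    stage: str    # which check produced this feedback, e.g. "compile"
    ok: bool      # True if the check passed
    message: str  # diagnostic text handed back to the agent


# Hypothetical type aliases: a check inspects a kernel, an agent proposes one.
Check = Callable[[str], Feedback]
Agent = Callable[[Optional[Feedback]], str]


def run_checks(kernel: str, checks: List[Check]) -> Feedback:
    """Run checks in order and return the first failure, if any."""
    for check in checks:
        feedback = check(kernel)
        if not feedback.ok:
            return feedback
    return Feedback("all", True, "all checks passed")


def refine(agent: Agent, checks: List[Check],
           max_steps: int = 5) -> Tuple[Optional[str], int]:
    """Ask the agent for kernels until one passes or the budget runs out."""
    feedback: Optional[Feedback] = None
    for step in range(1, max_steps + 1):
        kernel = agent(feedback)
        feedback = run_checks(kernel, checks)
        if feedback.ok:
            return kernel, step
    return None, max_steps


# Toy stand-ins (assumptions, not real tooling): a "compiler" that requires a
# trailing HALT and a "correctness test" that requires a MAC instruction.
def toy_compile_check(kernel: str) -> Feedback:
    ok = kernel.strip().endswith("HALT")
    return Feedback("compile", ok, "ok" if ok else "missing HALT")


def toy_correctness_check(kernel: str) -> Feedback:
    ok = "MAC" in kernel
    return Feedback("correctness", ok, "ok" if ok else "wrong result")


def toy_agent(feedback: Optional[Feedback]) -> str:
    """Scripted agent that reacts to the stage of the last failure."""
    if feedback is None:
        return "LOAD\nADD"
    if feedback.stage == "compile":
        return "LOAD\nADD\nHALT"
    return "LOAD\nMAC\nHALT"


if __name__ == "__main__":
    kernel, steps = refine(toy_agent, [toy_compile_check, toy_correctness_check])
    print(steps)
    print(kernel)
```

In this toy run the scripted agent fails the compile check first, then the correctness check, and passes on its third proposal. A real setup would replace the scripted agent with an LLM issuing function calls and add a simulation stage between the two checks shown here.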
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Multi-Agent Systems and Negotiation · Artificial Intelligence in Games
