Fine-Tuning GPT-5 for GPU Kernel Generation

Ali Tehrani; Yahya Emara; Essam Wissam; Wojciech Paluch; Waleed Atallah; Łukasz Dudziak; Mohamed S. Abdelfattah

arXiv:2602.11000·cs.DC·February 12, 2026

Open Access

TL;DR

This paper reports that reinforcement learning (RL) fine-tuning of GPT-5 improves both the correctness and the performance of generated GPU kernels, surpassing prior models and outperforming the TorchInductor compiler on KernelBench.

Contribution

The paper presents Makora's RL environment and tools for fine-tuning GPT-5 and reports substantial gains in GPU kernel correctness and efficiency over baseline models.

Findings

01. Kernel correctness improved from 43.7% to 77.0%.

02. The fine-tuned model outperformed TorchInductor on KernelBench.

03. The model achieved a 97.4% problem-solving rate when used in a full coding agent.
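
The findings above combine two kinds of measurement: whether a generated kernel is correct, and whether it runs faster than a baseline such as a compiler's output. The sketch below shows one way such numbers could be aggregated. It is an illustration only, not the paper's evaluation code: the `KernelResult` record, its field names, and the sample timings are hypothetical, and the `fast_p` function follows the KernelBench-style idea of counting kernels that are both correct and faster than the baseline by more than a factor `p`.

```python
from dataclasses import dataclass


@dataclass
class KernelResult:
    """Hypothetical record for one generated kernel (names are assumptions)."""
    correct: bool        # output matched the reference within tolerance
    kernel_ms: float     # measured runtime of the generated kernel
    baseline_ms: float   # measured runtime of the baseline implementation


def correctness_rate(results):
    """Fraction of kernels whose output matched the reference."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(1 for r in results if r.correct) / len(results)


def fast_p(results, p=1.0):
    """Fraction of kernels that are correct AND more than p times faster
    than the baseline (speedup = baseline_ms / kernel_ms)."""
    if not results:
        raise ValueError("results must not be empty")
    hits = sum(
        1 for r in results
        if r.correct and (r.baseline_ms / r.kernel_ms) > p
    )
    return hits / len(results)


# Made-up sample data, purely for illustration.
results = [
    KernelResult(correct=True, kernel_ms=1.0, baseline_ms=2.0),   # 2x faster
    KernelResult(correct=True, kernel_ms=3.0, baseline_ms=2.0),   # slower
    KernelResult(correct=False, kernel_ms=0.5, baseline_ms=2.0),  # wrong output
    KernelResult(correct=True, kernel_ms=1.0, baseline_ms=1.5),   # 1.5x faster
]

print(correctness_rate(results))
print(fast_p(results, p=1.0))
```

An incorrect kernel never counts toward `fast_p`, however fast it is, which is why correctness and speed are reported together rather than separately.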

Abstract

Developing efficient GPU kernels is essential for scaling modern AI systems, yet it remains a complex task due to intricate hardware architectures and the need for specialized optimization expertise. Although Large Language Models (LLMs) demonstrate strong capabilities in general sequential code generation, they face significant challenges in GPU code generation because of the scarcity of high-quality labeled training data, compiler biases when generating synthetic solutions, and limited generalization across hardware generations. This precludes supervised fine-tuning (SFT) as a scalable methodology for improving current LLMs. In contrast, reinforcement learning (RL) offers a data-efficient and adaptive alternative but requires access to relevant tools, careful selection of training problems, and a robust evaluation environment. We present Makora's environment and tools for…

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Advanced Neural Network Applications · Machine Learning in Materials Science