OptiML: An End-to-End Framework for Program Synthesis and CUDA Kernel Optimization
Arijit Bhattacharjee, Heng Ping, Son Vu Le, Paul Bogdan, Nesreen K. Ahmed, Ali Jannesari

TL;DR
OptiML is an end-to-end framework that synthesizes and optimizes CUDA kernels by combining language models, search, and verification, using hardware feedback to guide the search toward high performance.
Contribution
This work introduces OptiML, a novel framework that systematically explores and verifies CUDA kernel optimizations using LLMs, search algorithms, and profiler feedback.
Findings
OptiML outperforms baseline LLM approaches in kernel performance.
It produces verified, interpretable optimization trajectories.
OptiML effectively balances synthesis and optimization for diverse CUDA kernels.
Abstract
Generating high-performance CUDA kernels remains challenging due to the need to navigate a combinatorial space of low-level transformations under noisy and expensive hardware feedback. Although large language models can synthesize functionally correct CUDA code, achieving competitive performance requires systematic exploration and verification of optimization choices. We present OptiML, an end-to-end framework that maps either natural-language intent or input CUDA code to performance-optimized CUDA kernels by formulating kernel optimization as search under verification. OptiML consists of two decoupled stages. When the input is natural language, a Mixture-of-Thoughts generator (OptiML-G) acts as a proposal policy over kernel implementation strategies, producing an initial executable program. A search-based optimizer (OptiML-X) then refines either synthesized or user-provided kernels…
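To make the phrase "search under verification" concrete, here is a minimal sketch in Python. It is not OptiML's algorithm, which the truncated abstract does not fully specify. It only illustrates the general pattern: propose rewrites, discard any that fail a correctness check against a reference, and accept a verified rewrite only if its measured cost improves. Every name here (`Candidate`, `verify`, `toy_propose`, `search_under_verification`) is hypothetical, the "kernels" are plain Python functions, and `cost` is a made-up number standing in for a profiler measurement.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str
    run: Callable[[Sequence[float]], float]  # stand-in for a compiled kernel
    cost: float                              # stand-in for a profiler measurement


def reference(xs: Sequence[float]) -> float:
    """Trusted implementation used as the correctness oracle."""
    return sum(x * x for x in xs)


def verify(candidate: Candidate, test_inputs, tol: float = 1e-9) -> bool:
    """A candidate passes only if it matches the reference on every test input."""
    return all(abs(candidate.run(xs) - reference(xs)) <= tol for xs in test_inputs)


def toy_propose(current: Candidate) -> List[Candidate]:
    """Hypothetical proposal step: one fast-but-wrong rewrite, one correct rewrite."""
    return [
        # Drops the last element: cheaper, but incorrect.
        Candidate(current.name + "+unsafe",
                  lambda xs: sum(x * x for x in xs[:-1]),
                  current.cost * 0.5),
        # Same result as the reference, with a modest assumed cost reduction.
        Candidate(current.name + "+unroll",
                  lambda xs: sum(x * x for x in xs),
                  current.cost * 0.8),
    ]


def search_under_verification(start: Candidate, propose, test_inputs,
                              steps: int = 3) -> Tuple[Candidate, List[str]]:
    """Greedy search: keep the cheapest verified candidate at each step."""
    best = start
    trajectory = [start.name]
    for _ in range(steps):
        verified = [c for c in propose(best) if verify(c, test_inputs)]
        improved = [c for c in verified if c.cost < best.cost]
        if not improved:
            break  # no verified rewrite is cheaper; stop early
        best = min(improved, key=lambda c: c.cost)
        trajectory.append(best.name)
    return best, trajectory


TEST_INPUTS = [[1.0, 2.0, 3.0], [0.5, -4.0], [7.0]]
START = Candidate("baseline", reference, cost=10.0)
best, trajectory = search_under_verification(START, toy_propose, TEST_INPUTS, steps=2)
print(trajectory, best.cost)
```

In this toy run the "+unsafe" rewrite always has the lowest cost but never passes verification, so it never enters the trajectory. The recorded list of accepted rewrites is a simple analogue of a verified optimization trajectory.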
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Evolutionary Algorithms and Applications
