Equivalence Checking of ML GPU Kernels
Kshitij Dubey, Benjamin Driscoll, Anjiang Wei, Neeraj Kayal, Rahul Sharma, Alex Aiken

TL;DR
This paper introduces VOLTA, the first formal equivalence checker for GPU kernels used in machine learning, capable of verifying the correctness of kernels generated by humans, LLMs, and compilers.
Contribution
It presents a novel equivalence checking method for GPU kernels, ensuring correctness across different kernel generation approaches, including LLMs.
Findings
VOLTA is sound and complete for a class of GPU kernels
Successfully verifies ML kernels like convolutions and matrix multiplications
Enhances reliability of GPU kernel optimizations in ML applications
Abstract
With the rapid progress of deep learning and large language models (LLMs), companies now spend enormous sums executing GPU kernels. These kernels have, therefore, become prime targets for aggressive optimization. Recent efforts increasingly leverage LLMs to generate GPU kernels, but make no formal guarantees about the generated kernels. We present the first equivalence checker for GPU kernels and use it to formally verify the correctness of machine learning (ML) kernels optimized by hand, by LLMs, and by compilers. We show that our equivalence checker is sound and, for a well-defined class of GPU kernels which includes the programs of interest, complete. Our implementation, VOLTA, can verify ML computations such as convolutions, matrix multiplications, and various attention mechanisms.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsParallel Computing and Optimization Techniques · Advanced Neural Network Applications · Big Data and Digital Economy
