Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation
William F. Godoy, Pedro Valero-Lara, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

TL;DR
This paper assesses OpenAI Codex's ability to generate HPC kernels across multiple programming models, revealing strengths in mature models like OpenMP and CUDA, and highlighting areas for improvement in others like HIP.
Contribution
It provides a comprehensive evaluation of AI-generated HPC kernels across diverse languages and models, introducing a proficiency metric for comparison.
Findings
OpenAI Codex performs well with mature models like OpenMP and CUDA.
Adding keywords to the prompt improves code generation for Fortran and Python.
Performance varies across programming models, reflecting their maturity.
Abstract
We evaluate AI-assisted generative capabilities on fundamental numerical kernels in high-performance computing (HPC), including AXPY, GEMV, GEMM, SpMV, Jacobi Stencil, and CG. We test the generated kernel codes for a variety of language-supported programming models, including (1) C++ (e.g., OpenMP [including offload], OpenACC, Kokkos, SyCL, CUDA, and HIP), (2) Fortran (e.g., OpenMP [including offload] and OpenACC), (3) Python (e.g., numpy, Numba, cuPy, and pyCUDA), and (4) Julia (e.g., Threads, CUDA.jl, AMDGPU.jl, and KernelAbstractions.jl). We use the GitHub Copilot capabilities powered by OpenAI Codex available in Visual Studio Code as of April 2023 to generate a vast amount of implementations given simple <kernel> + <programming model> + <optional hints> prompt variants. To quantify and compare the results, we propose a proficiency metric around the initial 10 suggestions given for…
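The <kernel> + <programming model> + <optional hints> prompt pattern described in the abstract can be illustrated with a minimal sketch. The kernel and model names below come from the abstract; the hint value, the separator, and the build_prompts helper are hypothetical and are not taken from the paper, which may have composed its prompts differently.

```python
from itertools import product

# Kernel and programming-model names as listed in the abstract.
KERNELS = ["AXPY", "GEMV", "GEMM", "SpMV", "Jacobi Stencil", "CG"]
MODELS = ["OpenMP", "OpenACC"]

# Hypothetical hints for illustration only; None means "no optional hint".
HINTS = [None, "subroutine"]


def build_prompts(kernels, models, hints):
    """Return one prompt string per (kernel, model, hint) combination.

    Assumption: the parts are joined with a single space. The abstract
    does not specify the exact prompt formatting.
    """
    prompts = []
    for kernel, model, hint in product(kernels, models, hints):
        parts = [kernel, model]
        if hint is not None:
            parts.append(hint)
        prompts.append(" ".join(parts))
    return prompts


prompts = build_prompts(KERNELS, MODELS, HINTS)
for prompt in prompts[:4]:
    print(prompt)
```

Each resulting string would be typed as a comment or bare line in the editor to trigger suggestions; how the suggestions are then scored is left out here because the metric definition is truncated in the abstract above.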
