HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration
Jiaqi Lv, Xufeng He, Yanchen Liu, Xu Dai, Aocheng Shen, Yinghao Li, Jiachen Hao, Jianrong Ding, Yang Hu, Shouyi Yin

TL;DR
This paper introduces HPCTransCompile, an AI-compiler-generated dataset and framework for transpiling high-performance CUDA code to other platforms, and evaluates LLMs on this task, reporting an average speedup of 43.8% in CUDA-to-CPU transpilation.
Contribution
The paper presents a novel dataset and framework for high-performance CUDA transpilation using AI, along with a benchmark for evaluating LLMs in this task, addressing current limitations in workload coverage and generalizability.
Findings
Average speedup of 43.8% in CUDA-to-CPU transpilation.
Effective data augmentation enhances LLM performance.
Benchmark demonstrates LLM potential in high-performance code translation.
Abstract
The rapid growth of deep learning has driven exponential increases in model parameters and computational demands. NVIDIA GPUs and their CUDA-based software ecosystem provide robust support for parallel computing, significantly alleviating computational bottlenecks. Meanwhile, due to the cultivation of user programming habits and the high performance of GPUs, the CUDA ecosystem has established a dominant position in the field of parallel software. This dominance requires other hardware platforms to support CUDA-based software with performance portability. However, translating CUDA code to other platforms poses significant challenges due to differences in parallel programming paradigms and hardware architectures. Existing approaches rely on language extensions, domain-specific languages (DSLs), or compilers but face limitations in workload coverage and generalizability. Moreover, these…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Logic, programming, and type systems · Big Data and Digital Economy
