Challenges and Design Considerations for Finding CUDA Bugs Through GPU-Native Fuzzing
Mingkai Li, Joseph Devietti, Suman Jana, Tanvir Ahmed Khan

TL;DR
This paper highlights the challenges in detecting CUDA bugs in heterogeneous CPU-GPU systems and advocates for GPU-native fuzzing to improve memory safety verification.
Contribution
The paper identifies the limitations of current CPU-based fuzzing for GPU programs and proposes design considerations for a GPU-native fuzzing approach to improve memory safety.
Findings
Current mitigation methods often rely on unfaithful translations, i.e., converting GPU programs to run on CPUs for testing.
GPU-native fuzzing can better capture the architectural differences between CPUs and GPUs.
The number of exploitable bugs in heterogeneous systems rises every year.
Abstract
Modern computing is shifting from homogeneous CPU-centric systems to heterogeneous systems with closely integrated CPUs and GPUs. While the CPU software stack has benefited from decades of memory safety hardening, the GPU software stack remains dangerously immature. This discrepancy presents a critical ethical challenge: the world's most advanced AI and scientific workloads are increasingly deployed on vulnerable hardware components. In this paper, we study the key challenges of ensuring memory safety on heterogeneous systems. We show that, while the number of exploitable bugs in heterogeneous systems rises every year, current mitigation methods often rely on unfaithful translations, i.e., converting GPU programs to run on CPUs for testing, which fails to capture the architectural differences between CPUs and GPUs. We argue that the faithfulness of the program behavior is at the core…
Taxonomy
Topics: Security and Verification in Computing · Software Testing and Debugging Techniques · Radiation Effects in Electronics
