GPUArmor: A Hardware-Software Co-design for Efficient and Scalable Memory Safety on GPUs
Mohamed Tarek Ibn Ziad, Sana Damani, Mark Stephenson, Stephen W. Keckler, Aamer Jaleel

TL;DR
GPUArmor is a hardware-software co-design that enhances memory safety on GPUs with minimal performance and storage overheads, addressing scalability issues of previous solutions.
Contribution
It introduces lightweight hardware support combined with compiler analysis for scalable, efficient memory safety on GPUs, including a variant that requires no recompilation.
Findings
Achieves an average runtime overhead of 2.3% on modern GPU workloads.
GPUArmor-HWOnly reduces storage overheads significantly.
Maintains high performance with negligible overheads.
Abstract
Memory safety errors continue to pose a significant threat to current computing systems, and graphics processing units (GPUs) are no exception. A prominent class of memory safety algorithms is allocation-based solutions. The key idea is to maintain each allocation's metadata (base address and size) in a disjoint table and retrieve it at runtime to verify memory accesses. While several previous solutions have adopted allocation-based algorithms (e.g., cuCatch and GPUShield), they typically suffer from high memory overheads or scalability problems. In this work, we examine the key characteristics of real-world GPU workloads and observe several differences between GPU and CPU applications regarding memory access patterns, memory footprint, number of live allocations, and active allocation working set. Our observations motivate GPUArmor, a hardware-software co-design framework for memory…
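The allocation-based idea described in the abstract (keep each allocation's base address and size in a disjoint table, then look it up to verify an access) can be illustrated with a small host-side sketch. This is not GPUArmor's mechanism, nor that of cuCatch or GPUShield; the names `AllocMeta`, `BoundsTable`, `on_alloc`, `on_free`, and `check` are hypothetical, and the ordered-map lookup is an assumption chosen only for readability.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical metadata record: one entry per live allocation.
struct AllocMeta {
    std::uintptr_t base;  // first valid address of the allocation
    std::size_t size;     // allocation size in bytes
};

// Hypothetical disjoint table: metadata lives apart from the data it describes.
class BoundsTable {
public:
    // Record a new allocation. Zero-sized allocations are not tracked in this sketch.
    void on_alloc(std::uintptr_t base, std::size_t size) {
        assert(size > 0);
        table_[base] = AllocMeta{base, size};
    }

    // Drop the metadata when the allocation is freed.
    void on_free(std::uintptr_t base) { table_.erase(base); }

    // Returns true only if [addr, addr + len) lies inside one live allocation.
    bool check(std::uintptr_t addr, std::size_t len) const {
        // Find the last allocation whose base is <= addr.
        auto it = table_.upper_bound(addr);
        if (it == table_.begin()) return false;  // addr is below every allocation
        --it;
        const AllocMeta& m = it->second;
        const std::uintptr_t off = addr - m.base;
        // Written to avoid overflow in addr + len.
        return off <= m.size && len <= m.size - off;
    }

private:
    std::map<std::uintptr_t, AllocMeta> table_;
};
```

In this sketch, every access pays for a table lookup and the table grows with the number of live allocations, which gives a rough sense of where the memory overhead and scalability concerns mentioned in the abstract can come from.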
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Security and Verification in Computing · Distributed systems and fault tolerance
