Optimising GPGPU Execution Through Runtime Micro-Architecture Parameter Analysis
Giuseppe M. Sarda, Nimish Shah, Debjyoti Bhattacharjee, Peter Debacker, Marian Verhelst

TL;DR
This paper introduces a hardware-aware runtime mapping technique for open-source GPGPU platforms that optimizes performance by analyzing micro-architecture parameters, surpassing traditional hardware-agnostic methods.
Contribution
It presents a novel micro-architecture parameter analysis approach for runtime kernel mapping on open-source GPGPUs, improving performance and resource utilization.
Findings
Significant performance improvements on Vortex GPGPU
Effective optimization across various GPU configurations
Enhanced hardware resource utilization
Abstract
GPGPU execution analysis has always been tied to closed-source, proprietary benchmarking tools that provide high-level, non-exhaustive, and/or statistical information, preventing a thorough understanding of bottlenecks and optimization possibilities. Open-source hardware platforms offer opportunities to overcome such limits and co-optimize the full {hardware-mapping-algorithm} compute stack. Yet, so far, this has remained under-explored. In this work, we exploit micro-architecture parameter analysis to develop a hardware-aware, runtime mapping technique for OpenCL kernels on the open Vortex RISC-V GPGPU. Our method is based on trace observations and targets optimal hardware resource utilization to achieve superior performance and flexibility compared to hardware-agnostic mapping approaches. The technique was validated on different architectural GPU configurations across several OpenCL…
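To make the idea of hardware-aware mapping concrete, the sketch below picks an OpenCL work-group size from micro-architecture parameters (cores, warps per core, threads per warp) and compares it with a fixed, hardware-agnostic choice. It is an illustration only, not the authors' algorithm: the `GpuConfig` class, the function names, the utilization metric, and the assumption that each work-group occupies one hardware thread are all hypothetical and introduced here for the example.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class GpuConfig:
    """Hypothetical description of a GPGPU configuration (not a Vortex API)."""
    cores: int
    warps_per_core: int
    threads_per_warp: int

    @property
    def hardware_threads(self) -> int:
        # Total number of hardware threads available across the device.
        return self.cores * self.warps_per_core * self.threads_per_warp


def choose_local_size(global_size: int, gpu: GpuConfig) -> int:
    """Pick a work-group size so the number of groups matches the thread count.

    Assumption of this sketch: one work-group runs on one hardware thread.
    """
    if global_size <= 0:
        raise ValueError("global_size must be positive")
    return max(1, ceil(global_size / gpu.hardware_threads))


def utilization(global_size: int, local_size: int, gpu: GpuConfig) -> float:
    """Fraction of hardware-thread slots kept busy, under the same assumption."""
    groups = ceil(global_size / local_size)
    # Groups beyond the thread count need additional scheduling rounds.
    rounds = ceil(groups / gpu.hardware_threads)
    return groups / (rounds * gpu.hardware_threads)


if __name__ == "__main__":
    gpu = GpuConfig(cores=4, warps_per_core=4, threads_per_warp=4)
    global_size = 4096

    aware = choose_local_size(global_size, gpu)
    agnostic = 1024  # a fixed size chosen without looking at the hardware

    print("hardware-aware local size:", aware)
    print("utilization (aware):   ", utilization(global_size, aware, gpu))
    print("utilization (agnostic):", utilization(global_size, agnostic, gpu))
```

In this toy model the fixed work-group size leaves most hardware threads idle, while the size derived from the configuration fills them all. A real mapping would also have to respect OpenCL constraints such as the device's maximum work-group size and, on older OpenCL versions, the requirement that the global size be a multiple of the local size.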
