Control Flow Management in Modern GPUs
Mojtaba Abaie Shoushtary, Jordi Tubella Murgadas, Antonio Gonzalez

TL;DR
This paper introduces Hanoi, a low-cost control flow management mechanism for GPUs that closely mimics real hardware, enabling better performance modeling and software control despite vendor opacity.
Contribution
It defines semantics for the control flow instructions of the Turing ISA and proposes Hanoi, a mechanism that replicates hardware control flow with minimal discrepancy.
Findings
Control flow discrepancy between real hardware and Hanoi is only 1.03%
Hanoi's IPC difference from real hardware is just 0.19%
Hanoi provides a cost-effective way to model GPU control flow
Abstract
In GPUs, the control flow management mechanism determines which threads in a warp are active at any point in time. This mechanism monitors the control flow of scalar threads within a warp to optimize thread scheduling and plays a critical role in the utilization of execution resources. The control flow management mechanism can be controlled or assisted by software through instructions. However, GPU vendors do not disclose details about their compiler, ISA, or hardware implementations. This lack of transparency makes it challenging for researchers to understand how the control flow management mechanism functions, is implemented, or is assisted by software, which is crucial when it significantly affects their research. It is also problematic for performance modeling of GPUs, as one can only rely on traces from real hardware for control flow and cannot model or modify the functionality of…
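To make the idea of "which threads in a warp are active" concrete, the following is a minimal sketch of a divergent if/else handled with per-path active masks. It is a generic textbook-style illustration, not the Hanoi mechanism and not any vendor's hardware. The names `WARP_SIZE` and `run_if_else`, the 8-thread warp, and the branch body are all assumptions chosen for readability.

```python
# Illustrative only: a textbook-style mask stack, NOT the Hanoi mechanism
# and NOT a description of real GPU hardware.

WARP_SIZE = 8  # assumption: small warp for readability (real warps are wider)


def run_if_else(values):
    """Run `if v % 2 == 0: v //= 2 else: v = 3 * v + 1` across one warp."""
    full_mask = (1 << WARP_SIZE) - 1
    # Bit i is set when thread i takes the "then" side of the branch.
    taken = sum(1 << i for i, v in enumerate(values) if v % 2 == 0)
    # Pending paths as (label, active mask); "then" is popped first.
    stack = [("else", full_mask & ~taken), ("then", taken)]
    out = list(values)
    trace = []
    while stack:
        label, mask = stack.pop()
        if mask == 0:
            continue  # no thread takes this path, so it is skipped entirely
        trace.append((label, mask))
        for i in range(WARP_SIZE):
            if (mask >> i) & 1:  # only active threads execute the path
                out[i] = out[i] // 2 if label == "then" else 3 * out[i] + 1
    # Both paths are done: threads reconverge and the full mask is active.
    trace.append(("reconverge", full_mask))
    return out, trace


values = list(range(WARP_SIZE))
out, trace = run_if_else(values)
for label, mask in trace:
    print(f"{label:>10}: {mask:0{WARP_SIZE}b}")
```

Each printed line shows the active mask for one phase, with the lowest-numbered thread in the rightmost bit. While the warp is diverged, only part of it is active on each path, which is why control flow handling affects the utilization of execution resources.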
Taxonomy
Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
