Techniques for Shared Resource Management in Systems with Throughput Processors
Rachata Ausavarungnirun

TL;DR
This paper proposes memory hierarchy management techniques for GPU-based systems that reduce interference and improve performance. The techniques make the memory system aware of application characteristics and modify the cache, memory scheduling, and address translation mechanisms.
Contribution
It introduces GPU-aware memory management mechanisms, including cache, memory scheduling, and address translation modifications, to mitigate interference in heterogeneous CPU-GPU systems.
Findings
Significant reduction in memory interference effects.
Improved GPU and system performance metrics.
Effective mitigation of intra- and inter-application interference.
Abstract
The continued growth in the computational capability of throughput processors has made them the platform of choice for a wide variety of high-performance computing applications. Graphics Processing Units (GPUs) are a prime example of throughput processors that can deliver high performance for applications ranging from typical graphics applications to general-purpose data-parallel (GPGPU) applications. However, this success has been accompanied by new performance bottlenecks throughout the memory hierarchy of GPU-based systems. We identify and eliminate performance bottlenecks caused by major sources of interference throughout the memory hierarchy. We introduce changes to the memory hierarchy for systems with GPUs that allow the memory hierarchy to be aware of both CPU and GPU applications' characteristics. We introduce mechanisms to dynamically analyze different…
