A Graph-based Model for GPU Caching Problems
Lingda Li, Ari B. Hayes, Stephen A. Hackler, Eddy Z. Zhang, Mario Szegedy, Shuaiwen Leon Song

TL;DR
This paper introduces a new graph-based model for GPU caching that improves task scheduling accuracy and efficiency, leading to significant performance gains in GPU applications.
Contribution
A novel task partition model tailored for GPU caching that enhances scheduling accuracy and algorithm speed compared to previous methods.
Findings
Achieves significant performance improvements in GPU applications
Provides rigorous theoretical analysis of algorithm bounds
Demonstrates effectiveness through extensive experiments
Abstract
Modeling data sharing in GPU programs is a challenging task because of the massive parallelism and complex data sharing patterns provided by GPU architectures. Better GPU caching efficiency can be achieved through careful task scheduling among different threads. Traditionally, in the field of parallel computing, graph partition models are used to model data communication and guide task scheduling. However, we discover that the previous methods are either inaccurate or expensive when applied to GPU programs. In this paper, we propose a novel task partition model that is accurate and gives rise to the development of fast and high quality task/data reorganization algorithms. We demonstrate the effectiveness of the proposed model by rigorous theoretical analysis of the algorithm bounds and extensive experimental analysis. The experimental results show that it achieves significant…
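To make concrete the idea that task scheduling affects caching efficiency, here is a toy sketch. It is not the paper's model or algorithm: it assumes each thread block has its own cache and loads every distinct data item it touches exactly once, and the workload, the task names, and the function `loads_per_schedule` are all hypothetical.

```python
from typing import Dict, List, Set


def loads_per_schedule(task_data: Dict[str, Set[str]], blocks: List[List[str]]) -> int:
    """Sum, over all blocks, the number of distinct data items the block touches.

    Assumption (illustrative only): one load per distinct item per block.
    """
    total = 0
    for block in blocks:
        touched: Set[str] = set()
        for task in block:
            touched |= task_data[task]
        total += len(touched)
    return total


# Hypothetical workload: four tasks, each reading two data items.
task_data = {
    "t0": {"a", "b"},
    "t1": {"c", "d"},
    "t2": {"a", "b"},
    "t3": {"c", "d"},
}

naive = [["t0", "t1"], ["t2", "t3"]]      # tasks grouped in index order
regrouped = [["t0", "t2"], ["t1", "t3"]]  # tasks that share data grouped together

naive_loads = loads_per_schedule(task_data, naive)
regrouped_loads = loads_per_schedule(task_data, regrouped)
print(naive_loads, regrouped_loads)
```

Under these assumptions, grouping tasks that share data into the same block halves the number of loads. Finding such groupings at scale is the kind of problem a partition model is meant to guide.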
Taxonomy
Topics: Caching and Content Delivery · Optimization and Search Problems · Graph Theory and Algorithms
