A Synergy between On- and Off-Chip Data Reuse for GPU-based Out-of-Core Stencil Computation
Jingcheng Shen, Linbo Long, Jun Zhang, Weiqi Shen, Masao Okita, Fumihiko Ino

TL;DR
This paper introduces SO2DR, an approach that combines on- and off-chip data reuse to optimize GPU-based out-of-core stencil computation, reducing CPU-GPU data transfer and improving kernel execution performance.
Contribution
It proposes a new method, SO2DR, that simultaneously optimizes data transfer and kernel execution for out-of-core stencil codes on GPUs, filling a gap in existing research.
Findings
Achieves an average speedup of 2.78x over traditional out-of-core methods.
Reduces CPU-GPU data transfer time significantly.
Enhances kernel execution performance in stencil computations.
Abstract
Stencil computation is a widely used class of scientific-computing applications that can be efficiently accelerated by graphics processing units (GPUs). Out-of-core approaches enable a GPU to handle large stencil codes whose data size exceeds the memory capacity of the GPU. However, current research on out-of-core stencil computation primarily focuses on minimizing the amount of data transferred between the CPU and GPU. Few studies consider optimizing data transfer and kernel execution simultaneously. To fill this research gap, this work presents a synergy between on- and off-chip data reuse for out-of-core stencil codes, termed SO2DR. First, overlapping regions between data chunks are shared in the off-chip memory to eliminate redundant CPU-GPU data transfer. Second, redundant computation at the off-chip memory level is intentionally introduced to decouple kernel execution…
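The first idea in the abstract, sharing the overlapping regions between chunks so they are not sent from the CPU twice, can be illustrated with a small CPU-only simulation. This is a minimal sketch under stated assumptions, not the SO2DR implementation: it uses a hypothetical 1D 3-point averaging stencil, plain Python lists standing in for host and device memory, and a halo whose width equals the number of time steps, so each chunk redundantly recomputes its halo cells. The names `step`, `in_core`, and `out_of_core` are illustrative and do not come from the paper, and on-chip reuse (registers, shared memory) is not modeled.

```python
def step(u):
    """Apply one 3-point averaging step; the two end cells are held fixed."""
    if len(u) < 3:
        return list(u)
    inner = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + inner + [u[-1]]


def in_core(u, steps):
    """Reference: run every time step on the whole array."""
    for _ in range(steps):
        u = step(u)
    return u


def out_of_core(host, steps, chunk, reuse):
    """Process `host` chunk by chunk.

    Returns (result, number of elements copied from the simulated host).
    With reuse=True, the part of a chunk's input that the previous chunk
    already brought to the simulated device is taken from there instead.
    """
    n = len(host)
    result = [0.0] * n
    sent = 0
    dev_lo, dev = 0, []  # time-0 input left on the "device" by the previous chunk
    for s in range(0, n, chunk):
        e = min(s + chunk, n)
        # Halo of width `steps` on each side (stencil radius 1, `steps` time steps).
        lo, hi = max(0, s - steps), min(n, e + steps)
        dev_hi = dev_lo + len(dev)
        if reuse and lo < dev_hi:
            # [lo, start) is already resident; only [start, hi) comes from the host.
            start = min(dev_hi, hi)
            block = dev[lo - dev_lo:start - dev_lo] + host[start:hi]
            sent += hi - start
        else:
            block = host[lo:hi]
            sent += hi - lo
        dev_lo, dev = lo, block
        # Halo cells are recomputed redundantly; only cells [s, e) are kept.
        out = in_core(block, steps)
        result[s:e] = out[s - lo:e - lo]
    return result, sent


# Example run: 64 cells, 4 time steps, chunks of 16 cells.
host = [float(i * i % 17) for i in range(64)]
reference = in_core(host, 4)
naive, sent_naive = out_of_core(host, 4, 16, reuse=False)
shared, sent_shared = out_of_core(host, 4, 16, reuse=True)
```

In this toy setting both variants reproduce the in-core result, while the variant without sharing copies 88 elements from the host and the sharing variant copies 64, that is, each input element exactly once. The gap grows with the halo width relative to the chunk size.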
Taxonomy
Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Error Correcting Code Techniques
