Compression-Based Optimizations for Out-of-Core GPU Stencil Computation
Jingcheng Shen, Xin Deng, Yifan Wu, Masao Okita, Fumihiko Ino

TL;DR
This paper introduces compression-based techniques to optimize out-of-core GPU stencil computations, reducing CPU-GPU data transfer and GPU memory usage for datasets that exceed GPU memory capacity.
Contribution
The paper presents an on-the-fly compression technique and a single working buffer technique designed for out-of-core GPU stencil computations.
Findings
Achieved 1.1x speedup in stencil computation performance.
Reduced GPU memory consumption by 33%.
Demonstrated effectiveness on NVIDIA Tesla V100 GPU.
Abstract
An out-of-core stencil computation code handles data whose size exceeds the capacity of GPU memory. However, such a code must frequently stream data to and from the GPU. As a result, data movement between the CPU and GPU usually limits performance. In this work, compression-based optimizations are proposed. First, an on-the-fly compression technique is applied to an out-of-core stencil code, reducing CPU-GPU memory copies. Second, a single working buffer technique is used to reduce GPU memory consumption. Experimental results show that the stencil code using the proposed techniques achieved a 1.1x speedup and reduced GPU memory consumption by 33.0% on an NVIDIA Tesla V100 GPU.
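The idea of streaming compressed blocks through one working buffer can be illustrated with a small CPU-only sketch. It is not the authors' implementation: the abstract does not name the compressor or describe the buffer layout, so `zlib` stands in for the compression step, a 1D 3-point averaging stencil stands in for the real kernel, and the names `compress_block`, `decompress_block`, `stencil_reference`, and `stencil_out_of_core` are hypothetical. Each block is stored compressed with a one-cell halo, decompressed into a single working buffer, updated, and compressed again before being "copied back".

```python
import zlib
from array import array


def compress_block(values):
    """Pack floats into bytes and compress them (zlib is an assumed stand-in)."""
    return zlib.compress(array("d", values).tobytes())


def decompress_block(blob):
    """Inverse of compress_block: returns an array of doubles."""
    out = array("d")
    out.frombytes(zlib.decompress(blob))
    return out


def stencil_reference(data):
    """One in-core sweep of a 3-point average; the two end cells stay fixed."""
    out = list(data)
    for i in range(1, len(data) - 1):
        out[i] = (data[i - 1] + data[i] + data[i + 1]) / 3.0
    return out


def stencil_out_of_core(data, block_size):
    """Same sweep, but the domain is held as compressed blocks on the 'host'."""
    n = len(data)

    # Host side: compress each block together with a one-cell halo per side.
    stored = []
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        lo, hi = max(start - 1, 0), min(stop + 1, n)
        stored.append((start, stop, lo, compress_block(data[lo:hi])))

    results = []
    for start, stop, lo, blob in stored:
        # "Device" side: only one uncompressed block exists at a time.
        work = decompress_block(blob)
        updated = []
        for i in range(start, stop):
            j = i - lo  # index of cell i inside the working buffer
            if i == 0 or i == n - 1:
                updated.append(work[j])  # fixed boundary cell
            else:
                updated.append((work[j - 1] + work[j] + work[j + 1]) / 3.0)
        # Compress the result before it is transferred back to the host.
        results.append(compress_block(updated))

    # Decompress everything only to return a plain list for inspection.
    out = []
    for blob in results:
        out.extend(decompress_block(blob))
    return out
```

Calling `stencil_out_of_core(data, block_size)` on a list of floats should give the same values as `stencil_reference(data)` for any positive block size, because both apply the same arithmetic to the same neighbors. In a real GPU code the decompress, compute, and compress steps would run on the device, so that only compressed bytes cross the CPU-GPU link.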
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Algorithms and Data Compression · Advanced Data Storage Technologies
