ZipFlow: a Compiler-based Framework to Unleash Compressed Data Movement for Modern GPUs
Gwangoo Yeo, Zhiyang Shen, Wei Cui, Matteo Interlandi, Rathijit Sen, Bailu Ding, Qi Chen, Minsoo Rhu

TL;DR
ZipFlow is a compiler-based framework that optimizes compressed data transfer and decompression for GPU-accelerated data analytics. It classifies compression algorithms into three patterns by their inherent parallelism and exploits GPU parallelism to speed up data movement.
Contribution
ZipFlow introduces a holistic, pattern-based approach to optimizing compressed data movement on GPUs, with the goal of improving end-to-end query performance.
Findings
Achieves a 2.08x speedup over nvCOMP
Runs 3.14x faster than CPU-based engines
Effectively exploits GPU parallelism across architectures
Abstract
In GPU-accelerated data analytics, transferring data from CPU to GPU becomes a performance bottleneck once the data scales beyond GPU memory capacity, because PCIe bandwidth is limited. Data compression comes to the rescue by reducing the amount of data transferred while taking advantage of the GPU's compute power for decompression. To optimize end-to-end query performance, however, the workflow of data compression, transfer, and decompression must be designed holistically, based on the compression strategies and hardware characteristics, to balance I/O latency against computational overhead. In this work, we present ZipFlow, a compiler-based framework for optimizing compressed data transfer in GPU-accelerated data analytics. ZipFlow classifies compression algorithms into three distinct patterns based on their inherent parallelism. For each pattern, ZipFlow employs…
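The balance between I/O latency and decompression overhead described in the abstract can be illustrated with a back-of-the-envelope cost model. The sketch below is not ZipFlow's model or API; the function names, parameters, and the assumption that transfer and decompression either run strictly one after the other or overlap perfectly are all hypothetical simplifications chosen for illustration.

```python
def uncompressed_time_s(size_gb: float, pcie_gb_per_s: float) -> float:
    """Time to move raw data over PCIe (hypothetical model: size / bandwidth)."""
    if size_gb < 0 or pcie_gb_per_s <= 0:
        raise ValueError("size must be >= 0 and bandwidth must be > 0")
    return size_gb / pcie_gb_per_s


def compressed_time_s(
    size_gb: float,
    compression_ratio: float,
    pcie_gb_per_s: float,
    decomp_gb_per_s: float,
    overlapped: bool = True,
) -> float:
    """Time to move compressed data and decompress it on the GPU.

    Assumptions (illustrative only, not taken from the paper):
    - compression_ratio = uncompressed size / compressed size
    - decomp_gb_per_s is measured in uncompressed output bytes
    - overlapped=True models ideal pipelining, where the slower stage
      dominates; overlapped=False models transfer followed by decompression
    """
    if compression_ratio <= 0 or pcie_gb_per_s <= 0 or decomp_gb_per_s <= 0:
        raise ValueError("ratio and throughputs must be > 0")
    if size_gb < 0:
        raise ValueError("size must be >= 0")
    transfer = size_gb / compression_ratio / pcie_gb_per_s
    decompress = size_gb / decomp_gb_per_s
    return max(transfer, decompress) if overlapped else transfer + decompress


def speedup(baseline_s: float, optimized_s: float) -> float:
    """Ratio of baseline time to optimized time."""
    if optimized_s <= 0:
        raise ValueError("optimized time must be > 0")
    return baseline_s / optimized_s
```

Under this model, compression only pays off when the decompression stage is faster than the uncompressed transfer it replaces: with a slow decompressor, `compressed_time_s` can exceed `uncompressed_time_s` even at a high compression ratio, which is the computational-overhead side of the trade-off.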
Taxonomy
Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management
