Accelerating Parallel Write via Deeply Integrating Predictive Lossy Compression with HDF5
Sian Jin, Dingwen Tao, Houjun Tang, Sheng Di, Suren Byna, Zarija Lukic, Franck Cappello

TL;DR
This paper introduces a method to accelerate parallel data writing in HPC by integrating predictive lossy compression with HDF5, enabling overlapping of compression and write operations, resulting in significant performance gains.
Contribution
It presents a novel deep integration of predictive lossy compression with HDF5, including analytical models and task reordering for improved parallel write performance.
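To make the idea of task reordering concrete, here is a minimal sketch under simplifying assumptions: each process compresses its data fields one after another, and each field's write can start only once that field is compressed and the previous write has finished. The ordering rule shown is Johnson's rule for two-stage pipelines, a textbook scheduling technique swapped in for illustration. It is not necessarily the reordering heuristic used in the paper, and the function names and timings are hypothetical.

```python
def pipeline_makespan(tasks):
    """Total time for (compress_time, write_time) tasks run in the given order.

    Compression runs sequentially; a task's write starts once its own
    compression and the previous task's write are both finished.
    """
    compress_done = 0.0
    write_done = 0.0
    for compress_time, write_time in tasks:
        compress_done += compress_time
        write_done = max(write_done, compress_done) + write_time
    return write_done


def johnson_order(tasks):
    """Reorder tasks with Johnson's rule (an assumption, not the paper's method).

    Tasks whose compression is no longer than their write go first, shortest
    compression first; the rest go last, longest write first.
    """
    first = sorted((t for t in tasks if t[0] <= t[1]), key=lambda t: t[0])
    last = sorted((t for t in tasks if t[0] > t[1]), key=lambda t: t[1], reverse=True)
    return first + last


# Hypothetical predicted (compress, write) times in seconds for three fields.
predicted = [(4.0, 1.0), (1.0, 3.0), (2.0, 2.0)]
print(pipeline_makespan(predicted))                 # original order
print(pipeline_makespan(johnson_order(predicted)))  # reordered
```

In this toy setting the reordered schedule starts writing earlier, so more of the write time is hidden behind compression. Such an ordering can only be chosen ahead of time if compression and write times are predicted before compression starts, which is the role the analytical models play.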
Findings
Up to 4.5x faster write performance with compression.
Only 1.5% additional storage overhead.
Effective on large-scale HPC systems with thousands of cores.
Abstract
Lossy compression is one of the most efficient solutions to reduce storage overhead and improve I/O performance for HPC applications. However, existing parallel I/O libraries cannot fully utilize lossy compression to accelerate parallel write due to the lack of deep understanding of compression-write performance. To this end, we propose to deeply integrate predictive lossy compression with HDF5 to significantly improve the parallel-write performance. Specifically, we propose analytical models to predict the time of compression and parallel write before the actual compression to enable compression-write overlapping. We also introduce an extra space in the process to handle possible data overflows resulting from prediction uncertainty in compression ratios. Moreover, we propose an optimization to reorder the compression tasks to increase the overlapping efficiency. Experiments with up to…
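The extra-space idea in the abstract can be illustrated with a small sketch. It assumes that each process's file offset is fixed from a predicted compressed size before compression finishes, that each slot is padded by a fixed percentage, and that any bytes exceeding the slot must be stored somewhere else (for example, appended after all reserved slots). The function names, the percentage-based padding, and the overflow handling are illustrative assumptions rather than details taken from the paper or from the HDF5 API.

```python
def plan_offsets(predicted_sizes, extra_percent):
    """Reserve one slot per process from predicted compressed sizes (bytes).

    Each slot is the predicted size plus extra_percent padding, rounded up.
    Returns (offsets, slot_sizes, total_reserved_bytes).
    """
    offsets, slots = [], []
    cursor = 0
    for size in predicted_sizes:
        # Integer arithmetic avoids floating-point rounding surprises.
        slot = size + (size * extra_percent + 99) // 100
        offsets.append(cursor)
        slots.append(slot)
        cursor += slot
    return offsets, slots, cursor


def split_overflow(actual_size, slot):
    """Split actual compressed bytes into (bytes_in_slot, overflow_bytes)."""
    in_slot = min(actual_size, slot)
    return in_slot, actual_size - in_slot


# Hypothetical predictions for three processes, with 10% extra space.
offsets, slots, total = plan_offsets([1000, 2000, 500], extra_percent=10)
print(offsets, slots, total)
print(split_overflow(1050, slots[0]))  # prediction slightly low, still fits
print(split_overflow(1200, slots[0]))  # prediction too low, overflow remains
```

The trade-off in this sketch is between padding and overflow: a larger padding percentage wastes more storage but makes overflows rarer, while a smaller one saves space at the cost of more out-of-slot writes.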
Taxonomy
Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
