SIMD Lossy Compression for Scientific Data
Griffin Dube, Jiannan Tian, Sheng Di, Dingwen Tao, Jon Calhoun, Franck Cappello

TL;DR
This paper presents a SIMD-vectorized CPU implementation of the SZ lossy compressor for scientific data. By parallelizing the prediction/quantization step, it achieves significant speedups and improved compression quality.
Contribution
It introduces a SIMD CPU version of SZ with heuristics for block size and vector length, and explores padding strategies to enhance parallelism and reduce outliers.
Findings
Up to 32% better rate-distortion performance.
Up to 15x faster than SZ-1.4.
Padding eliminates outliers entirely (a 100% reduction) in some cases.
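A minimal sketch of why the padding value can matter for outliers, under stated assumptions. The summary above does not specify the padding strategies, so the function name `count_outliers`, the `radius` threshold, and the choice of the block mean as a padding value are all hypothetical and used only for illustration. The idea shown: the first element of a block has no real neighbor, so it is predicted from the padding value, and a poor padding value yields a quantization code too large to fit the code range.

```python
def count_outliers(block, error_bound, pad_value, radius=512):
    """Count quantization codes that fall outside [-radius, radius).

    Hypothetical helper: 1D blocks, previous-value prediction, and an
    assumed code range of `radius`; not the paper's implementation.
    """
    step = 2.0 * error_bound
    # Pre-quantize every value independently onto an integer grid.
    prequant = [round(x / step) for x in block]
    # The padding value stands in for the missing neighbor of element 0.
    pad_q = round(pad_value / step)
    predicted = [pad_q] + prequant[:-1]
    deltas = [q - p for q, p in zip(prequant, predicted)]
    return sum(1 for d in deltas if abs(d) >= radius)


block = [1000.0, 1000.1, 1000.2, 1000.3]
zero_padded = count_outliers(block, 0.1, 0.0)
mean_padded = count_outliers(block, 0.1, sum(block) / len(block))
```

With zero padding, the first value of this block is far from its prediction and counts as an outlier; padding with a value close to the data (here the block mean, as an assumed choice) removes it.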
Abstract
Modern HPC applications produce increasingly large amounts of data, which limits the performance of current extreme-scale systems. Data reduction techniques, such as lossy compression, help to mitigate this issue by decreasing the size of data generated by these applications. SZ, a current state-of-the-art lossy compressor, is able to achieve high compression ratios, but the prediction/quantization methods used introduce dependencies which prevent parallelizing this step of the compression. Recent work proposes a parallel dual prediction/quantization algorithm for GPUs which removes these dependencies. However, some HPC systems and applications do not use GPUs, and could still benefit from the fine-grained parallelism of this method. Using the dual-quantization technique, we implement and optimize a SIMD vectorized CPU version of SZ, and create a heuristic for selecting the optimal…
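The dependency removal mentioned in the abstract can be illustrated with a small sketch. This is a simplified 1D version of the dual-quantization idea, assuming a previous-value (Lorenzo-style) predictor; the function names `dual_quantize` and `reconstruct` are hypothetical, and the code is scalar Python rather than the SIMD implementation the paper describes. The point is that both loops in `dual_quantize` read only the input or the pre-quantized array, never previously reconstructed values, so every iteration is independent and could be vectorized.

```python
def dual_quantize(data, error_bound):
    """Return integer quantization codes for `data` (illustrative sketch)."""
    step = 2.0 * error_bound
    # Step 1 (pre-quantization): snap each value to an integer grid.
    # Each element depends only on its own input value.
    prequant = [round(x / step) for x in data]
    # Step 2 (post-quantization): predict each grid value from its
    # pre-quantized left neighbor. No reconstructed values are needed,
    # so there is no loop-carried dependency.
    return [prequant[0]] + [
        prequant[i] - prequant[i - 1] for i in range(1, len(prequant))
    ]


def reconstruct(codes, error_bound):
    """Invert dual_quantize with a running sum of the codes."""
    step = 2.0 * error_bound
    running = 0
    out = []
    for code in codes:
        running += code
        out.append(running * step)
    return out


data = [1.0, 1.05, 1.2, 0.9]
eb = 0.01
codes = dual_quantize(data, eb)
restored = reconstruct(codes, eb)
max_error = max(abs(a - b) for a, b in zip(data, restored))
```

In this sketch the only lossy step is the pre-quantization rounding, so `max_error` stays within the error bound (up to floating-point rounding), while the differencing step is exact integer arithmetic.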
Taxonomy
Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
