GWLZ: A Group-wise Learning-based Lossy Compression Framework for Scientific Data
Wenqi Jia, Sian Jin, Jinzhen Wang, Wei Niu, Dingwen Tao, Miao Yin

TL;DR
GWLZ is a novel deep learning-based lossy compression framework that significantly improves data reconstruction quality for scientific datasets with minimal overhead, addressing the limitations of traditional methods in managing exascale data.
Contribution
GWLZ introduces a group-wise learning approach that uses multiple lightweight neural enhancers to improve the reconstruction quality of lossy-compressed scientific data.
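To make the group-wise idea concrete, here is a minimal sketch, not the authors' implementation. It assumes that values are grouped by equal-width value range and that each group gets its own tiny corrector; a per-group affine least-squares fit stands in for the paper's neural enhancers. The names `fit_affine`, `group_index`, the group count, and the synthetic data are all hypothetical.

```python
import math
import random

def fit_affine(xs, ys):
    """Least-squares fit of y ~ a * x + b (stand-in for a learned enhancer)."""
    n = len(xs)
    if n < 2:
        return 1.0, 0.0  # too little data: leave the group unchanged
    mx = sum(xs) / n
    my = sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    if var == 0.0:
        return 1.0, my - mx  # all inputs equal: best correction is an offset
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx

def group_index(x, lo, hi, n_groups):
    """Assumed grouping rule: equal-width bins over the value range."""
    if hi <= lo:
        return 0
    return min(int((x - lo) / (hi - lo) * n_groups), n_groups - 1)

def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

# Synthetic skewed data and a crude uniform quantizer as the "lossy compressor".
random.seed(0)
original = [math.exp(random.gauss(0.0, 1.0)) for _ in range(2000)]
step = 0.5
decompressed = [round(v / step) * step for v in original]

# Assign every decompressed value to a group.
n_groups = 4
lo, hi = min(decompressed), max(decompressed)
groups = [group_index(x, lo, hi, n_groups) for x in decompressed]

# Fit one small corrector per group against the original data.
params = []
for g in range(n_groups):
    xs = [x for x, k in zip(decompressed, groups) if k == g]
    ys = [y for y, k in zip(original, groups) if k == g]
    params.append(fit_affine(xs, ys))

# Apply each group's corrector to its own values.
# Note: this sketch does not re-enforce any error bound after correction.
enhanced = [params[k][0] * x + params[k][1] for x, k in zip(decompressed, groups)]

mse_before = mse(original, decompressed)
mse_after = mse(original, enhanced)
print(f"MSE before: {mse_before:.6f}, after: {mse_after:.6f}")
```

Because the identity mapping is always a candidate for each group's fit, the corrected error in this toy setup cannot exceed the uncorrected error on the data it was fitted to; how much it improves depends on the data and the quantizer.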
Findings
Achieves up to 20% quality improvement in data reconstruction.
Keeps compression overhead negligible, as low as 0.0003x.
Demonstrates effectiveness on diverse scientific datasets.
Abstract
The rapid expansion of computational capabilities and the ever-growing scale of modern HPC systems present formidable challenges in managing exascale scientific data. Faced with such vast datasets, traditional lossless compression techniques prove insufficient in reducing data size to a manageable level while preserving all information intact. In response, researchers have turned to error-bounded lossy compression methods, which offer a balance between data size reduction and information retention. However, despite their utility, these compressors employing conventional techniques struggle with limited reconstruction quality. To address this issue, we draw inspiration from recent advancements in deep learning and propose GWLZ, a novel group-wise learning-based lossy compression framework with multiple lightweight learnable enhancer models. Leveraging a group of neural networks, GWLZ…
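For readers unfamiliar with the term, the error-bounded guarantee mentioned in the abstract can be illustrated with a minimal sketch. It assumes a plain uniform scalar quantizer with bin width twice the error bound; this is a textbook construction, not the compressor used in the paper, and the function names and test signal are hypothetical.

```python
import math
import random

def compress(values, error_bound):
    """Map each value to an integer bin index; bin width is 2 * error_bound."""
    step = 2.0 * error_bound
    return [round(v / step) for v in values]

def decompress(codes, error_bound):
    """Reconstruct each value as the center of its bin."""
    step = 2.0 * error_bound
    return [c * step for c in codes]

# Synthetic smooth signal with a little noise.
random.seed(0)
data = [math.sin(0.01 * i) + 0.05 * random.random() for i in range(1000)]

eb = 1e-2  # absolute error bound chosen for this example
codes = compress(data, eb)
recon = decompress(codes, eb)

# Every reconstructed value lies within the error bound of its original.
max_err = max(abs(a - b) for a, b in zip(data, recon))
distinct_codes = len(set(codes))
print(f"max error: {max_err:.6f} (bound {eb}), distinct codes: {distinct_codes}")
```

The integer codes take far fewer distinct values than the input, which is what makes them cheap to encode, while the pointwise error stays within the chosen bound.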
Taxonomy
Topics: Advanced Database Systems and Queries · Data Mining Algorithms and Applications · Scientific Computing and Data Management
