SDRBench: Scientific Data Reduction Benchmark for Lossy Compressors

Kai Zhao; Sheng Di; Xin Liang; Sihuan Li; Dingwen Tao; Julie Bessac; Zizhong Chen; Franck Cappello

arXiv:2101.03201 · cs.DC · November 4, 2021

TL;DR

SDRBench is a benchmark for evaluating and comparing scientific data compressors on real-world datasets from different domains, helping researchers assess compression performance and reconstruction quality.

Contribution

The paper introduces SDRBench, a standardized benchmark with diverse datasets and evaluation metrics for fair comparison of lossy and lossless scientific data compressors.

Findings

01 Evaluation of multiple compressors across datasets

02 Six key insights into lossy compressor performance

03 Guidelines for selecting suitable compression methods

Abstract

Efficient error-controlled lossy compressors are becoming critical to the success of today's large-scale scientific applications because of the ever-increasing volume of data produced by the applications. In the past decade, many lossless and lossy compressors have been developed with distinct design principles for different scientific datasets in largely diverse scientific domains. In order to support researchers and users assessing and comparing compressors in a fair and convenient way, we establish a standard compression assessment benchmark -- Scientific Data Reduction Benchmark (SDRBench). SDRBench contains a vast variety of real-world scientific datasets across different domains, summarizes several critical compression quality evaluation metrics, and integrates many state-of-the-art lossy and lossless compressors. We demonstrate evaluation results using SDRBench and summarize six…
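
The abstract mentions error-controlled lossy compression and compression quality metrics without listing them in the text shown here. As a rough illustration only, the sketch below computes three metrics commonly used for such compressors (maximum absolute error, PSNR, and compression ratio). The choice of metrics, the toy uniform quantizer standing in for a real compressor, the synthetic sine-wave data, and all function names are assumptions for illustration; none of them is taken from SDRBench itself.

```python
import math
import struct
import zlib


def quantize(values, error_bound):
    """Toy error-bounded step: map each value to a bin of width 2 * error_bound."""
    return [round(v / (2.0 * error_bound)) for v in values]


def dequantize(codes, error_bound):
    """Reconstruct each value as the center of its quantization bin."""
    return [c * 2.0 * error_bound for c in codes]


def max_abs_error(original, reconstructed):
    """Largest pointwise deviation between original and reconstructed data."""
    return max(abs(a - b) for a, b in zip(original, reconstructed))


def psnr(original, reconstructed):
    """Peak signal-to-noise ratio in dB, using the data's value range as the peak."""
    value_range = max(original) - min(original)
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf
    return 20 * math.log10(value_range) - 10 * math.log10(mse)


def compression_ratio(original, codes):
    """Size of the raw float64 data divided by the size of the zlib-packed codes."""
    raw = struct.pack(f"<{len(original)}d", *original)
    packed = zlib.compress(struct.pack(f"<{len(codes)}i", *codes))
    return len(raw) / len(packed)


# Hypothetical smooth 1D field standing in for a scientific dataset.
data = [math.sin(0.01 * i) for i in range(10_000)]
eb = 1e-3  # user-chosen absolute error bound (assumed value)

codes = quantize(data, eb)
recon = dequantize(codes, eb)

print("max abs error:", max_abs_error(data, recon))
print("PSNR (dB):", psnr(data, recon))
print("compression ratio:", compression_ratio(data, codes))
```

In this sketch the maximum absolute error stays within the requested bound (up to floating-point rounding), which is the property "error-controlled" refers to. Real compressors use far more elaborate prediction and encoding stages, and real datasets are often stored as 32-bit floats rather than the 64-bit values assumed here.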
