Enabling Homomorphic Analytical Operations on Compressed Scientific Data with Multi-stage Decompression
Xuan Wu, Sheng Di, Tripti Agarwal, Kai Zhao, Xin Liang, Franck Cappello

TL;DR
This paper introduces a multi-stage decompression framework with homomorphic algorithms that enables direct analytical operations on compressed scientific data, significantly reducing data access latency and improving efficiency.
Contribution
It presents a novel multi-stage decompression and homomorphic analysis framework for compressed scientific data, with implementations based on state-of-the-art compressors and demonstrated speedups.
Findings
Achieves significant speedups in analytical operations on compressed data
Supports multiple types of analytical operations directly on intermediate decompressed data
Validated on five real-world scientific datasets
Abstract
Error-controlled lossy compressors have been widely used in scientific applications to reduce the unprecedented size of scientific data while keeping data distortion within a user-specified threshold. While they significantly mitigate the pressure for data storage and transmission, they prolong the time to access the data because decompression is required to transform the binary compressed data into meaningful floating-point numbers. This incurs noticeable overhead for common analytical operations on scientific data that extract or derive useful information, because the time cost of the operations could be much lower than that of decompression. In this work, we design an error-controlled lossy compression and analytical framework that features multi-stage decompression and homomorphic analytical operation algorithms on intermediate decompressed data for reduced data access latency. Our…
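As a rough illustration of why operating on intermediate decompressed data can be cheaper, the sketch below assumes a simple linear-scaling quantizer with absolute error bound eb, where each value is stored as an integer code q and reconstructed as 2*eb*q. This is an assumption made for illustration, not the paper's actual pipeline, and the function names (quantize, dequantize, mean_full, mean_homomorphic) are hypothetical. Under that assumption, the mean can be computed on the integer codes and scaled once, instead of converting every element to floating point first.

```python
# Hypothetical sketch: linear-scaling quantization with absolute error bound eb.
# This is NOT the paper's implementation; all names here are illustrative.

def quantize(data, eb):
    """Map each value to an integer code for a bin of width 2*eb."""
    return [round(x / (2 * eb)) for x in data]

def dequantize(codes, eb):
    """Final decompression stage: turn integer codes back into floats."""
    return [2 * eb * q for q in codes]

def mean_full(codes, eb):
    """Baseline: fully decompress every element, then average."""
    values = dequantize(codes, eb)
    return sum(values) / len(values)

def mean_homomorphic(codes, eb):
    """Average the integer codes, then apply the scale factor once."""
    return 2 * eb * (sum(codes) / len(codes))

data = [0.123, 0.571, 1.034, -0.402]
eb = 0.01
codes = quantize(data, eb)  # stands in for the intermediate stage
print(mean_full(codes, eb), mean_homomorphic(codes, eb))
```

Both functions return the same mean up to floating-point rounding, but the second performs a single multiplication rather than one per element. A real compressor adds prediction and lossless encoding stages around the quantizer, so how much work can be skipped depends on the operation and on which stage the data is taken from.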
Taxonomy
Topics: Advanced Data Storage Technologies · Advanced Database Systems and Queries · Cloud Computing and Resource Management
