IDEALEM: Statistical Similarity Based Data Reduction
Dongeun Lee, Alex Sim, Jaesik Choi, Kesheng Wu

TL;DR
IDEALEM introduces a novel lossy compression method based on statistical similarity that significantly reduces data volume while preserving key patterns, outperforming existing methods especially for non-stationary data.
Contribution
The paper presents a new statistical similarity-based compression technique, IDEALEM, with two operational modes and an enhanced min/max check for better pattern preservation and faster encoding.
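As a rough illustration of what similarity-based reduction with a min/max check could look like, the Python sketch below splits a series into fixed-size blocks, keeps a dictionary of representative blocks, and replaces any block that is statistically similar to a stored one with that block's index. Everything in it is an assumption made for illustration, not IDEALEM's actual implementation: the two-sample Kolmogorov-Smirnov statistic as the similarity measure, the names `ks_statistic`, `encode`, and `decode`, and the parameters `block_size`, `ks_threshold`, and `range_tol`. The two operational modes mentioned above are not modeled.

```python
import bisect
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for x in sa + sb:
        fa = bisect.bisect_right(sa, x) / len(sa)  # fraction of a that is <= x
        fb = bisect.bisect_right(sb, x) / len(sb)  # fraction of b that is <= x
        d = max(d, abs(fa - fb))
    return d


def encode(data, block_size=32, ks_threshold=0.3, range_tol=0.5):
    """Return (dictionary, codes); codes[i] indexes the stored block standing in for block i.

    Hypothetical parameters: ks_threshold bounds the KS statistic for a match,
    range_tol bounds the allowed difference in block minimum and maximum.
    A trailing partial block is dropped to keep the sketch short.
    """
    dictionary, codes = [], []
    for start in range(0, len(data) - block_size + 1, block_size):
        block = data[start:start + block_size]
        lo, hi = min(block), max(block)
        match = None
        for idx, ref in enumerate(dictionary):
            # Cheap min/max check first: skip candidates whose range clearly differs.
            if abs(min(ref) - lo) > range_tol or abs(max(ref) - hi) > range_tol:
                continue
            # Costlier distribution comparison only for surviving candidates.
            if ks_statistic(ref, block) <= ks_threshold:
                match = idx
                break
        if match is None:
            dictionary.append(block)  # no similar block stored yet: keep this one
            match = len(dictionary) - 1
        codes.append(match)
    return dictionary, codes


def decode(dictionary, codes):
    """Rebuild a series by substituting the stored block for every code."""
    out = []
    for idx in codes:
        out.extend(dictionary[idx])
    return out


# Synthetic non-stationary series: the mean shifts from 0 to 10 halfway through.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(320)]
data += [rng.gauss(10.0, 1.0) for _ in range(320)]
dictionary, codes = encode(data)
restored = decode(dictionary, codes)
print(len(codes), "blocks encoded with", len(dictionary), "stored blocks")
```

Because a matched block is reconstructed from a different block with a similar distribution, the decoded series is not close to the original point by point; only the per-block distribution is approximately preserved. The min/max comparison runs before the costlier distribution test, so candidates with clearly different ranges are rejected early.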
Findings
IDEALEM achieves higher compression ratios than state-of-the-art methods.
The min/max check improves pattern preservation and encoding speed.
Spectral analysis confirms key frequency components are maintained.
Abstract
Many applications, such as scientific simulation, sensing, and power grid monitoring, tend to generate massive amounts of data, which should be compressed prior to storage and transmission. These data, composed mostly of floating-point values, are known to be difficult to compress using lossless compression. A few lossy compression methods have been proposed to compress these seemingly incompressible data. Unfortunately, they are all designed to minimize the Euclidean distance between the original data and the decompressed data, which fundamentally limits compression performance. We recently proposed a new class of lossy compression based on statistical similarity, called IDEALEM, which was also provided as a software package. IDEALEM has demonstrated its performance by reducing data volume much more than state-of-the-art compression methods while preserving…
Taxonomy
Topics: Algorithms and Data Compression · Advanced Data Compression Techniques · Advanced Data Storage Technologies
