SHRINK: Data Compression by Semantic Extraction and Residuals Encoding

Guoyou Sun; Panagiotis Karras; Qi Zhang

arXiv:2410.06713 · cs.DC · February 25, 2025

Open Access

TL;DR

SHRINK is a novel data compression method for IoT data that combines semantic extraction and residual encoding, achieving higher compression ratios and lower runtimes, especially at ultra-accurate levels.

Contribution

It introduces a dynamic, semantics-based compression approach that adapts to data characteristics and improves compression efficiency over existing methods.

Findings

01. Up to threefold improvement in compression ratio.

02. Higher compression ratio and lower runtime compared to prior methods.

03. Effective handling of diverse accuracy demands in IoT data compression.

Abstract

The distributed data infrastructure in Internet of Things (IoT) ecosystems requires efficient data-series compression methods that can also meet different accuracy demands. However, the performance of existing compression methods degrades sharply when ultra-accurate data recovery is required. In this paper, we introduce SHRINK, a novel, highly accurate data compression method that offers a higher compression ratio and a lower runtime than prior compressors. SHRINK extracts data semantics in the form of linear segments to construct a compact knowledge base, using a dynamic error threshold that it adapts to data characteristics. Then, it captures the remaining data details as residuals to support lossy compression at diverse resolutions as well as lossless compression. As SHRINK identifies repeated semantics, its compression ratio increases with data size. Our…
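
To make the two-stage idea in the abstract concrete, here is a minimal Python sketch of "linear segments plus residuals". It is not the authors' algorithm: it uses a fixed error threshold `eps` instead of SHRINK's dynamic one, a simple greedy slope-range fit, and uniform quantization of residuals, and it omits the knowledge base of repeated semantics entirely. All function names (`fit_segments`, `approximate`, `encode_residuals`, `decode`) and parameters (`eps`, `target`) are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

# (start index, number of points, value at start, slope) -- illustrative layout
Segment = Tuple[int, int, float, float]


def fit_segments(values: List[float], eps: float) -> List[Segment]:
    """Greedy piecewise-linear fit: each segment is anchored at its first point
    and extended while some slope keeps every covered point within eps."""
    segments: List[Segment] = []
    i, n = 0, len(values)
    while i < n:
        base = values[i]
        lo, hi = float("-inf"), float("inf")  # feasible slope interval
        j = i + 1
        while j < n:
            dx = j - i
            new_lo = max(lo, (values[j] - eps - base) / dx)
            new_hi = min(hi, (values[j] + eps - base) / dx)
            if new_lo > new_hi:  # no single slope fits point j as well
                break
            lo, hi = new_lo, new_hi
            j += 1
        slope = 0.0 if j == i + 1 else (lo + hi) / 2
        segments.append((i, j - i, base, slope))
        i = j
    return segments


def approximate(segments: List[Segment]) -> List[float]:
    """Rebuild the coarse series from the segments alone."""
    out: List[float] = []
    for _start, length, base, slope in segments:
        out.extend(base + slope * k for k in range(length))
    return out


def encode_residuals(values: List[float], approx: List[float], target: float) -> List[int]:
    """Quantize residuals with step 2*target so the final error stays within target."""
    step = 2 * target
    return [round((v - a) / step) for v, a in zip(values, approx)]


def decode(approx: List[float], codes: List[int], target: float) -> List[float]:
    """Add the dequantized residuals back onto the coarse series."""
    step = 2 * target
    return [a + q * step for a, q in zip(approx, codes)]


# Small demo on made-up readings (not data from the paper).
readings = [1.0, 1.1, 1.2, 1.35, 1.4, 5.0, 5.2, 5.4, 5.6, 5.8]
segs = fit_segments(readings, eps=0.05)
coarse = approximate(segs)
codes = encode_residuals(readings, coarse, target=0.001)
restored = decode(coarse, codes, target=0.001)
```

In this sketch the segments alone give a coarse reconstruction within `eps`, and the small integer residual codes refine it to any tighter `target`. The same segments can therefore serve several accuracy levels, which is the property the abstract describes as supporting "diverse resolutions".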

Taxonomy

Topics: Algorithms and Data Compression