Lossless preprocessing of floating point data to enhance compression

Francesco Taurone; Daniel E. Lucani; Marcell Fehér; Qi Zhang

arXiv:2308.03623·cs.DB·August 8, 2023

TL;DR

This paper introduces lossless preprocessing techniques tailored to floating point data, where building invertible transformations is particularly challenging, and improves compression rates by up to 40% on real datasets.

Contribution

The paper identifies key conditions under which basic operations on floating point data yield lossless transformations, and presents four methods that use these conditions to improve compression.

Findings

01 Achieved up to 40% better compression rates

02 Identified conditions for lossless transformations

03 Proposed four new preprocessing methods

Abstract

Data compression algorithms typically rely on identifying repeated sequences of symbols in the original data to provide a compact representation of the same information, while maintaining the ability to recover the original data from the compressed sequence. Applying data transformations prior to compression has the potential to enhance compression, and the overall process remains lossless as long as the transformation is invertible. Floating point data presents unique challenges for generating invertible transformations with high compression potential. This paper identifies key conditions for basic operations on floating point data that guarantee lossless transformations. Then, we show four methods that make use of these observations to deliver lossless compression of real datasets, where we improve compression rates by up to 40%.
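To make the invertibility requirement concrete, the following is a minimal Python sketch, not one of the paper's four methods. It assumes IEEE 754 double precision and uses hypothetical helper names (`offset_is_lossless`, `xor_encode`, `compress`, and so on). It shows that a transformation as simple as subtracting a constant is lossless only for some inputs, and contrasts it with an XOR of consecutive bit patterns, which is a swapped-in example of a transformation that is always invertible because it never rounds.

```python
import struct
import zlib


def to_bits(x):
    """Reinterpret a Python float (IEEE 754 double) as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(b):
    """Reinterpret a 64-bit unsigned integer as a Python float."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def offset_is_lossless(values, offset):
    """Check bit-exactly whether subtracting `offset` and adding it back
    recovers every value. Rounding in either step can break this."""
    return all(to_bits((v - offset) + offset) == to_bits(v) for v in values)


def xor_encode(values):
    """XOR each bit pattern with its predecessor (illustrative transform,
    assumed here; integer XOR is exact, so it is always invertible)."""
    bits = [to_bits(v) for v in values]
    return [b ^ prev for prev, b in zip([0] + bits[:-1], bits)]


def xor_decode(deltas):
    """Invert xor_encode by accumulating the XORs."""
    out, prev = [], 0
    for d in deltas:
        prev ^= d
        out.append(from_bits(prev))
    return out


def compress(values):
    """Preprocess, then hand the bytes to a general-purpose compressor."""
    payload = struct.pack(f"<{len(values)}Q", *xor_encode(values))
    return zlib.compress(payload)


def decompress(blob):
    """Undo the compressor, then undo the preprocessing."""
    payload = zlib.decompress(blob)
    deltas = struct.unpack(f"<{len(payload) // 8}Q", payload)
    return xor_decode(deltas)
```

Under these assumptions, `offset_is_lossless([1.5, 2.5], 1.0)` holds because every intermediate result is exactly representable, whereas `offset_is_lossless([0.1], 1e16)` fails because `0.1 - 1e16` rounds to `-1e16` and the small value is lost. Whether a given transformation actually improves the compressed size depends on the data and is not demonstrated by this sketch.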
