Bicriteria data compression
Andrea Farruggia, Paolo Ferragina, Antonio Frangioni, Rossano Venturini

TL;DR
This paper introduces a formal, optimal approach to bicriteria data compression using LZ77, balancing compression size and decompression speed, and demonstrates its effectiveness over existing heuristics.
Contribution
It formalizes the bicriteria LZ77-Parsing problem, providing an efficient algorithm with provable optimality and practical advantages over heuristic methods.
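To make the problem statement concrete, here is a minimal sketch under one reading of the bicriteria formulation: among all parsings of a text, pick the one with the smallest compressed size whose decompression time stays within a budget. The text is modelled as a graph over positions 0..n, where each edge (i, j) is a candidate phrase covering text[i:j] with a space cost and a time cost. The function name `min_space_parsing`, the edge costs, and the integer time units are all hypothetical, and the simple pseudo-polynomial dynamic program below is a stand-in for illustration, not the paper's O(n log^2 n) algorithm.

```python
from math import inf

def min_space_parsing(n, edges, time_budget):
    """Minimise total space over 0 -> n paths whose total time <= time_budget.

    edges maps (i, j), with i < j, to (space_bits, time_cost); time costs
    are non-negative integers. Returns (space, time, path) or None.
    """
    # best[v][t] = least space needed to reach position v using exactly t time
    best = [[inf] * (time_budget + 1) for _ in range(n + 1)]
    prev = [[None] * (time_budget + 1) for _ in range(n + 1)]
    best[0][0] = 0
    # Sorting by (i, j) visits every edge into a position before any edge
    # leaving it, because all edges go from a smaller to a larger position.
    for (i, j), (space, time) in sorted(edges.items()):
        for t in range(time_budget - time + 1):
            cand = best[i][t] + space
            if cand < best[j][t + time]:
                best[j][t + time] = cand
                prev[j][t + time] = (i, t)
    t_end = min(range(time_budget + 1), key=lambda t: best[n][t])
    if best[n][t_end] == inf:
        return None  # no parsing fits within the time budget
    path, v, t = [n], n, t_end
    while v != 0:
        v, t = prev[v][t]
        path.append(v)
    return best[n][t_end], t_end, path[::-1]

# Hypothetical phrase costs: short phrases decode fast but cost more bits
# in total; long phrases save bits but are assumed slower to decode.
edges = {
    (0, 1): (9, 1), (1, 2): (9, 1), (2, 3): (9, 1), (3, 4): (9, 1),
    (0, 2): (14, 4), (2, 4): (10, 6), (1, 4): (20, 3),
}
for budget in (10, 8, 5, 3):
    print(budget, min_space_parsing(4, edges, budget))
```

Tightening the budget forces the parsing toward faster but larger encodings, which is the trade-off the bicriteria formulation captures. The table in this sketch grows with the time budget, so it only suits small illustrative inputs.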
Findings
Algorithm runs in O(n log^2 n) time
Achieves optimal trade-offs between space and decompression time
Outperforms existing heuristic compression methods
Abstract
The advent of massive datasets (and the consequent design of high-performing distributed storage systems) has reignited the interest of the scientific and engineering community towards the design of lossless data compressors which achieve effective compression ratio and very efficient decompression speed. Lempel-Ziv's LZ77 algorithm is the de facto choice in this scenario because of its decompression speed and its flexibility in trading decompression speed versus compressed-space efficiency. Each of the existing implementations offers a trade-off between space occupancy and decompression speed, so software engineers have to content themselves with picking the one which comes closer to the requirements of the application in their hands. Starting from these premises, and for the first time in the literature, we address in this paper the problem of trading optimally, and in a principled…
Taxonomy
Topics: Algorithms and Data Compression · Complexity and Algorithms in Graphs · Parallel Computing and Optimization Techniques
