Bounds for Compression in Streaming Models
Travis Gagie (corresponding author)

TL;DR
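This paper asks in which popular streaming models compression can match what the Burrows-Wheeler Transform (BWT) achieves. It proves a bound that is achievable with the BWT but unachievable in the Standard, Multipass and W-Streams models, and shows that the bound can be achieved in the StreamSort and Read-Write models.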
Contribution
It proves a nearly tight tradeoff between memory and redundancy for the Standard, Multipass and W-Streams models, and shows how to compute the Schindler Transform in the StreamSort model and the BWT in the Read-Write model.
Findings
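Proved a nearly tight tradeoff between memory and redundancy for the Standard, Multipass and W-Streams models.
Showed that the Schindler Transform can be computed in the StreamSort model.
Showed that the BWT can be computed in the Read-Write model, so the bound is achievable in both of these models.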
Abstract
Compression algorithms and streaming algorithms are both powerful tools for dealing with massive data sets, but many of the best compression algorithms -- e.g., those based on the Burrows-Wheeler Transform -- at first seem incompatible with streaming. In this paper we consider several popular streaming models and ask in which, if any, we can compress as well as we can with the BWT. We first prove a nearly tight tradeoff between memory and redundancy for the Standard, Multipass and W-Streams models, demonstrating a bound that is achievable with the BWT but unachievable in those models. We then show we can compute the related Schindler Transform in the StreamSort model and the BWT in the Read-Write model and, thus, achieve that bound.
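For readers unfamiliar with the two transforms named in the abstract, the following is a minimal in-memory sketch, not the paper's streaming constructions. It assumes a sentinel character `$` that does not occur in the input and sorts smaller than every input character, and it assumes the common convention that the Schindler Transform of order `k` sorts rotations by their first `k` characters only, breaking ties by position in the text. The function names `rotations`, `bwt` and `schindler` are illustrative.

```python
def rotations(s):
    """Return all cyclic rotations of s, in order of starting position."""
    return [s[i:] + s[:i] for i in range(len(s))]


def bwt(text, sentinel="$"):
    """Naive BWT: sort all rotations fully, then take the last column.

    Assumes `sentinel` is absent from `text` and sorts before its characters.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    return "".join(rot[-1] for rot in sorted(rotations(s)))


def schindler(text, k, sentinel="$"):
    """Naive order-k Schindler Transform (assumed convention, see lead-in).

    Rotations are compared on their first k characters only. Python's sort
    is stable, so ties keep their original order, i.e. text position.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rots = rotations(s)
    rots.sort(key=lambda rot: rot[:k])
    return "".join(rot[-1] for rot in rots)


if __name__ == "__main__":
    print(bwt("banana"))           # annb$aa
    print(schindler("banana", 1))  # abnn$aa
    print(schindler("banana", 7))  # equals the BWT once k covers the whole string
```

Both functions here build every rotation explicitly and so use quadratic space, which is exactly what a streaming algorithm cannot afford; the sketch only illustrates what is being computed. Because the Schindler Transform compares bounded-length contexts, it agrees with the BWT once `k` is at least the length of the string including the sentinel.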
Taxonomy
Topics: Algorithms and Data Compression · Error Correcting Code Techniques · Computability, Logic, AI Algorithms
