Grammar-Based Compression in a Streaming Model
Travis Gagie, Pawel Gawrychowski

TL;DR
This paper shows that, with constant memory and logarithmically many passes over a constant number of streams, one can build a near-optimal context-free grammar for a string. The authors' earlier work showed that this is impossible with polylogarithmic memory and polylogarithmic passes over a single stream.
Contribution
It shows that grammar-based compression in a multi-stream model can achieve near-optimal grammar size using constant memory and logarithmically many passes over a constant number of streams.
Findings
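Achieves a grammar size within an $O(\min(g \log g, \sqrt{n / \log n}))$ factor of the minimum size $g$, for an input string of length $n$
Contrasts with the authors' previous result that, with polylogarithmic memory and polylogarithmic passes over a single stream, no grammar within any polynomial of the minimum size can be built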
Shows feasibility of efficient grammar-based compression in streaming models
Abstract
We show that, given a string $s$ of length $n$, with constant memory and logarithmic passes over a constant number of streams we can build a context-free grammar that generates $s$ and only $s$ and whose size is within an $O(\min(g \log g, \sqrt{n / \log n}))$-factor of the minimum $g$. This stands in contrast to our previous result that, with polylogarithmic memory and polylogarithmic passes over a single stream, we cannot build such a grammar whose size is within any polynomial of $g$.
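To make the object in the abstract concrete, here is a minimal Python sketch of a context-free grammar that generates one string and only that string, together with a size measure. It is an illustration only, not the paper's streaming algorithm. The names `expand`, `grammar_size`, and `example` are hypothetical, and measuring size as the total length of all right-hand sides is one common convention assumed here.

```python
from typing import Dict, List

# A straight-line grammar: every nonterminal has exactly one rule and rules
# only refer to nonterminals defined earlier, so exactly one string is derived.
Grammar = Dict[str, List[str]]


def expand(grammar: Grammar, symbol: str) -> str:
    """Return the unique string derived from `symbol`."""
    if symbol not in grammar:
        # Assumption: any symbol without a rule is a terminal character.
        return symbol
    return "".join(expand(grammar, s) for s in grammar[symbol])


def grammar_size(grammar: Grammar) -> int:
    """Size as the total length of all right-hand sides (assumed convention)."""
    return sum(len(rhs) for rhs in grammar.values())


# Hypothetical example: a grammar for "abababab" built by repeated doubling.
example: Grammar = {
    "X1": ["a", "b"],
    "X2": ["X1", "X1"],
    "S": ["X2", "X2"],
}

generated = expand(example, "S")
size = grammar_size(example)
```

Here the string has length 8 while the grammar has size 6, and each further doubling rule doubles the string length at a cost of only 2, which is why the minimum grammar size can be far smaller than the string length.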
Taxonomy
Topics: Algorithms and Data Compression · Semigroups and Automata Theory · Machine Learning and Algorithms
