TL;DR
This paper introduces a fast, space-efficient method for constructing small AVL grammars from LZ77 parsing, enabling quicker and more practical grammar-based compression and data structure construction.
Contribution
The authors present a novel algorithm that produces smaller AVL grammars from LZ77 parses with reduced memory usage, improving practicality and speed over previous methods.
Findings
The grammars produced are at least five times smaller than those produced by the original algorithm.
The method can produce the run-length BWT from LZ77 in about one third of the time required by other approaches.
Peak RAM usage is significantly reduced.
Abstract
Grammar compression is, next to Lempel-Ziv (LZ77) and run-length Burrows-Wheeler transform (RLBWT), one of the most flexible approaches to representing and processing highly compressible strings. The main idea is to represent a text as a context-free grammar whose language is precisely the input string. This is called a straight-line grammar (SLG). An AVL grammar, proposed by Rytter [Theor. Comput. Sci., 2003], is a type of SLG that additionally satisfies the AVL-property: the heights of parse-trees for children of every nonterminal differ by at most one. In contrast to other SLG constructions, AVL grammars can be constructed from the LZ77 parsing in compressed time: $O(z \log n)$, where $z$ is the size of the LZ77 parsing and $n$ is the length of the input text. Despite these advantages, AVL grammars are thought to be too large to be practical. We present a new technique for…
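To make the two definitions in the abstract concrete, here is a minimal Python sketch of a straight-line grammar stored as a DAG of binary nonterminals, with a check for the AVL-property. It only illustrates the definitions; it is not the construction algorithm from the paper, and all names (`Node`, `leaf`, `concat`, `expand`, `is_avl`, `grammar_size`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """One grammar symbol: a terminal (char set) or a nonterminal X -> left right."""
    char: Optional[str] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 0   # height of the parse tree rooted at this symbol
    length: int = 1   # length of the string this symbol expands to


def leaf(c: str) -> Node:
    return Node(char=c)


def concat(l: Node, r: Node) -> Node:
    # Create a nonterminal whose expansion is expand(l) + expand(r).
    return Node(left=l, right=r,
                height=1 + max(l.height, r.height),
                length=l.length + r.length)


def expand(node: Node) -> str:
    # The language of an SLG is exactly one string: the full expansion.
    if node.char is not None:
        return node.char
    return expand(node.left) + expand(node.right)


def is_avl(node: Node) -> bool:
    # AVL-property: children heights differ by at most one at every nonterminal.
    if node.char is not None:
        return True
    return (abs(node.left.height - node.right.height) <= 1
            and is_avl(node.left) and is_avl(node.right))


def grammar_size(root: Node) -> int:
    # Count distinct symbols by identity, so shared nonterminals count once.
    seen, stack = set(), [root]
    while stack:
        n = stack.pop()
        if id(n) in seen:
            continue
        seen.add(id(n))
        if n.char is None:
            stack.extend([n.left, n.right])
    return len(seen)


a, b = leaf("a"), leaf("b")

# Balanced grammar for "abab": S -> A A, A -> a b (A is shared).
A = concat(a, b)
balanced = concat(A, A)

# Left-skewed grammar for the same string: ((a b) a) b.
skewed = concat(concat(concat(a, b), a), b)

print(expand(balanced), is_avl(balanced), grammar_size(balanced))
print(expand(skewed), is_avl(skewed), grammar_size(skewed))
```

Both grammars expand to the same string, but only the first satisfies the AVL-property; the shared nonterminal `A` is what makes the balanced grammar smaller than its parse tree.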
