Grammar Boosting: A New Technique for Proving Lower Bounds for Computation over Compressed Data

Rajat De; Dominik Kempa

arXiv:2307.08833·cs.DS·September 24, 2024

Open Access

TL;DR

This paper introduces a general technique for proving lower bounds on algorithms and data structures operating on grammar-compressed strings. The technique applies regardless of the approximation ratio of the grammar compressor, and the authors demonstrate it with several concrete lower bounds.

Contribution

The authors develop the first general method for proving lower bounds for algorithms and data structures on grammar-compressed strings that does not depend on the approximation ratio $\rho$ of the grammar compressor.

Findings

01

Proves $\tilde{\Omega}(\log N)$ lower bounds on the time for random access on strings compressed with several grammar compressors.

02

Establishes lower bounds for CFG parsing conditioned on the $k$-Clique conjecture.

03

The lower bounds hold for data structures using $\mathcal{O}(n\,\text{polylog } N)$ space and match existing upper bounds.

Abstract

Grammar compression is a general compression framework in which a string $T$ of length $N$ is represented as a context-free grammar of size $n$ whose language contains only $T$. In this paper, we focus on studying the limitations of algorithms and data structures operating on strings in grammar-compressed form. Previous work focused on proving lower bounds for grammars constructed using algorithms that achieve the approximation ratio $\rho = \mathcal{O}(\text{polylog } N)$. Unfortunately, for the majority of grammar compressors, $\rho$ is either unknown or satisfies $\rho = \omega(\text{polylog } N)$. In their seminal paper, Charikar et al. [IEEE Trans. Inf. Theory 2005] studied seven popular grammar compression algorithms: RePair, Greedy, LongestMatch, Sequential, Bisection, LZ78, and $\alpha$-Balanced. Only one of them ($\alpha$-Balanced) is known to achieve $\rho=\mathcal{O}(\text{polylog…
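To make the setting concrete, the following is a minimal Python sketch of a grammar-compressed string and of the random access query that the lower bounds concern. The toy grammar `RULES` and the helper names `length`, `expand`, and `access` are illustrative assumptions and do not come from the paper. The grammar has size $n = 12$ (total length of all right-hand sides) and derives a single string $T$ of length $N = 24$. The `access` routine is the textbook top-down walk, whose cost is proportional to the height of the grammar; it is not the data structure analyzed by the authors.

```python
from functools import lru_cache

# Hypothetical toy grammar (an assumption for illustration): every
# nonterminal maps either to a single character or to a pair of
# nonterminals, so the grammar derives exactly one string.
RULES = {
    "A": "a",
    "B": "b",
    "X1": ("A", "B"),    # ab
    "X2": ("X1", "X1"),  # abab
    "X3": ("X2", "X2"),  # 8 characters
    "X4": ("X3", "X3"),  # 16 characters
    "S": ("X4", "X3"),   # 24 characters
}


@lru_cache(maxsize=None)
def length(sym):
    """Length of the string derived from nonterminal sym."""
    rhs = RULES[sym]
    if isinstance(rhs, str):
        return 1
    left, right = rhs
    return length(left) + length(right)


def expand(sym):
    """Fully decompress sym (takes time linear in the output length)."""
    rhs = RULES[sym]
    if isinstance(rhs, str):
        return rhs
    left, right = rhs
    return expand(left) + expand(right)


def access(sym, i):
    """Return character i (0-indexed) of the expansion of sym
    without decompressing, by descending from sym to a leaf."""
    if not 0 <= i < length(sym):
        raise IndexError(i)
    while True:
        rhs = RULES[sym]
        if isinstance(rhs, str):
            return rhs
        left, right = rhs
        if i < length(left):
            sym = left            # position lies in the left child
        else:
            i -= length(left)     # shift the index into the right child
            sym = right


# Grammar size n: total number of symbols on all right-hand sides.
grammar_size = sum(1 if isinstance(r, str) else len(r) for r in RULES.values())
```

For example, `access("S", 5)` walks from `S` down to a leaf and returns the same character as `expand("S")[5]`, without materializing all 24 characters.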

Taxonomy

Topics: Algorithms and Data Compression · semigroups and automata theory · Natural Language Processing Techniques