# The smallest grammar problem revisited

**Authors:** Hideo Bannai, Momoko Hirayama, Danny Hucke, Shunsuke Inenaga, Artur Jeż, Markus Lohrey, Carl Philipp Reh

arXiv: 1908.06428 · 2019-08-20

## TL;DR

This paper revisits the approximation ratios of the grammar-based compressors analyzed by Charikar et al. It closes the gaps between the known lower and upper bounds for LZ78 and BISECTION, and raises the lower bound for RePair.

## Contribution

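It establishes tight asymptotic bounds on the approximation ratios of LZ78 and BISECTION, improves the lower bound on the approximation ratio of RePair (whose upper bound remains open), and strengthens results of Arpe and Reischuk relating grammar-based compression over arbitrary alphabets to compression over binary alphabets.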
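For orientation, the approximation ratio of a grammar-based compressor can be written as below. This follows the standard definition from the smallest grammar literature; the symbols $\alpha_{\mathcal{C}}$, $|\mathcal{C}(w)|$, and $g(w)$ are notation assumed here for illustration and do not appear in this summary.

$$
\alpha_{\mathcal{C}}(n) \;=\; \max_{|w| = n} \frac{|\mathcal{C}(w)|}{g(w)}
$$

Here $|\mathcal{C}(w)|$ is the size of the grammar that compressor $\mathcal{C}$ produces for the string $w$, and $g(w)$ is the size of a smallest grammar for $w$. A bound such as $\Theta(\sqrt{n/\log n})$ therefore describes the worst case over all strings of length $n$.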

## Key findings

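- The approximation ratio of $\mathsf{LZ78}$ is $\Theta((n/\log n)^{2/3})$.
- The approximation ratio of $\mathsf{BISECTION}$ is $\Theta(\sqrt{n/\log n})$.
- The lower bound for $\mathsf{RePair}$ improves from $\Omega(\sqrt{\log n})$ to $\Omega(\log n/\log\log n)$.
- Results of Arpe and Reischuk relating grammar-based compression over arbitrary and binary alphabets are improved.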
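To make the LZ78 result more concrete, the following is a minimal sketch of the textbook LZ78 factorization, not code from the paper. The function name `lz78_factorize` is hypothetical. Each factor is the shortest prefix of the remaining input that has not yet appeared as a factor, and the grammar derived from the factorization has size proportional to the number of factors.

```python
def lz78_factorize(text):
    """Return the LZ78 factors of `text` (illustrative sketch).

    Each factor is the shortest prefix of the unread input that is not
    already a previous factor. The final factor may repeat an earlier
    one if the input ends in the middle of a known phrase.
    """
    seen = set()      # factors produced so far
    factors = []
    current = ""      # phrase being extended
    for ch in text:
        candidate = current + ch
        if candidate in seen:
            # Still a known phrase: keep extending it.
            current = candidate
        else:
            # First unseen extension: emit it as a new factor.
            seen.add(candidate)
            factors.append(candidate)
            current = ""
    if current:
        # Leftover suffix that matches an earlier factor.
        factors.append(current)
    return factors


if __name__ == "__main__":
    print(lz78_factorize("aaaaaa"))    # ['a', 'aa', 'aaa']
    print(lz78_factorize("abababab"))  # ['a', 'b', 'ab', 'aba', 'b']
```

On a unary string of length $n$ the factors have lengths 1, 2, 3, ..., so LZ78 produces on the order of $\sqrt{n}$ factors, which illustrates how far its output can be from a grammar of logarithmic size for the same string.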

## Abstract

In a seminal paper of Charikar et al. on the smallest grammar problem, the authors derive upper and lower bounds on the approximation ratios for several grammar-based compressors, but in all cases there is a gap between the lower and upper bound. Here the gaps for $\mathsf{LZ78}$ and $\mathsf{BISECTION}$ are closed by showing that the approximation ratio of $\mathsf{LZ78}$ is $\Theta( (n/\log n)^{2/3})$, whereas the approximation ratio of $\mathsf{BISECTION}$ is $\Theta(\sqrt{n/\log n})$. In addition, the lower bound for $\mathsf{RePair}$ is improved from $\Omega(\sqrt{\log n})$ to $\Omega(\log n/\log\log n)$. Finally, results of Arpe and Reischuk relating grammar-based compression for arbitrary alphabets and binary alphabets are improved.
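The BISECTION compressor named in the abstract can be sketched as follows. This is an illustrative reading of the algorithm as usually described (split a string of length at least 2 at the largest power of two strictly smaller than its length, recurse, and create one rule per distinct factor), not the authors' code. The names `bisection_rules` and `grammar_size` are hypothetical, and nonterminals are represented simply by the substring they derive.

```python
def bisection_rules(text):
    """Build BISECTION-style rules for a non-empty string (sketch).

    Returns a dict mapping each distinct factor of length >= 2 to the
    pair (left, right) it is split into. Factors of length 1 are
    terminals and get no rule.
    """
    if not text:
        raise ValueError("text must be non-empty")
    rules = {}

    def visit(factor):
        if len(factor) < 2 or factor in rules:
            return  # terminal symbol, or rule already created
        # Largest power of two strictly less than len(factor).
        split = 1 << ((len(factor) - 1).bit_length() - 1)
        left, right = factor[:split], factor[split:]
        rules[factor] = (left, right)
        visit(left)
        visit(right)

    visit(text)
    return rules


def grammar_size(rules):
    """Grammar size measured as the total length of all right-hand sides."""
    return sum(len(rhs) for rhs in rules.values())


if __name__ == "__main__":
    rules = bisection_rules("aaaaaaaa")
    print(len(rules), grammar_size(rules))  # 3 6
```

Because identical factors share a single rule, highly repetitive inputs such as `"aaaaaaaa"` yield very few rules, while inputs whose power-of-two-aligned blocks are all distinct yield many. The measure used in `grammar_size` (sum of right-hand-side lengths) is the usual convention and is assumed here.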

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.06428/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/1908.06428/full.md

---
Source: https://tomesphere.com/paper/1908.06428