Efficient Lyndon factorization of grammar compressed text

Tomohiro I; Yuto Nakashima; Shunsuke Inenaga; Hideo Bannai; Masayuki Takeda

arXiv:1304.7061·cs.DS·April 29, 2013

TL;DR

This paper introduces a polynomial-time algorithm for computing Lyndon factorization directly from grammar compressed text represented as an SLP, enabling efficient processing of exponentially large strings.

Contribution

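The paper presents the first polynomial-time algorithm for computing the Lyndon factorization of a string given as an SLP. Because the decompressed string can be exponentially longer than the SLP, any method that first decompresses the text can take time exponential in the size of the input.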

Findings

01

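The algorithm runs in $O(n^4 + mn^3h)$ time and $O(n^2)$ space, where $n$ is the size of the SLP, $m$ is the size of the Lyndon factorization, and $h$ is the height of the SLP's derivation tree.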

02

Enables Lyndon factorization of exponentially large strings from SLPs.

03

First polynomial-time solution for Lyndon factorization on grammar compressed text.
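
To make the "exponentially large" claim concrete, here is a minimal Python sketch. It assumes an SLP encoded as a list of rules in which each rule is either a single character or a pair of indices of earlier rules; this encoding, the helper names `slp_lengths` and `doubling_slp`, and the convention that a terminal rule has height 0 are illustrative assumptions, not taken from the paper. It computes the length of the derived string and the height of the derivation tree without decompressing.

```python
def slp_lengths(rules):
    """Return (lengths, heights) for every rule of an SLP.

    Assumed encoding (hypothetical, for illustration only):
    rules[i] is either a one-character string (terminal rule)
    or a pair (l, r) of indices of earlier rules (X_i -> X_l X_r).
    """
    lengths = []
    heights = []
    for rule in rules:
        if isinstance(rule, str):
            # Terminal rule derives a single character; height 0 by assumption.
            lengths.append(1)
            heights.append(0)
        else:
            l, r = rule
            lengths.append(lengths[l] + lengths[r])
            heights.append(1 + max(heights[l], heights[r]))
    return lengths, heights


def doubling_slp(k):
    """Build an SLP with k + 1 rules whose last rule derives 'a' repeated 2**k times."""
    rules = ["a"]
    for i in range(k):
        # Each new rule concatenates the previous rule with itself.
        rules.append((i, i))
    return rules


rules = doubling_slp(30)
lengths, heights = slp_lengths(rules)
print(len(rules), lengths[-1], heights[-1])
```

With 31 rules the derived string already has 2**30 characters, which is why running time measured in the SLP size rather than the text length matters.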

Abstract

We present an algorithm for computing the Lyndon factorization of a string that is given in grammar compressed form, namely, a Straight Line Program (SLP). The algorithm runs in $O(n^{4} + m n^{3} h)$ time and $O(n^{2})$ space, where $m$ is the size of the Lyndon factorization, $n$ is the size of the SLP, and $h$ is the height of the derivation tree of the SLP. Since the length of the decompressed string can be exponentially large w.r.t. $n$, $m$ and $h$, our result is the first polynomial time solution when the string is given as an SLP.
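
For readers unfamiliar with the problem, the following Python sketch shows what a Lyndon factorization is by computing it on an ordinary, uncompressed string. It uses Duval's classical linear-time algorithm, which is not the algorithm of this paper: it reads the whole text, so it would take exponential time on a string supplied as an SLP. The function name `lyndon_factorization` is an illustrative choice.

```python
def lyndon_factorization(s):
    """Factor s into a non-increasing sequence of Lyndon words (Duval's algorithm).

    This works on the plain string, not on an SLP.
    """
    factors = []
    i, n = 0, len(s)
    while i < n:
        # s[i:j] is a prefix of a power of a Lyndon word with period j - k.
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                # The current prefix extends to a longer Lyndon word.
                k = i
            else:
                # Still repeating the same period.
                k += 1
            j += 1
        # Emit every complete copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


print(lyndon_factorization("banana"))  # ['b', 'an', 'an', 'a']
```

Each factor is a Lyndon word (strictly smaller than all of its proper suffixes), and the factors appear in lexicographically non-increasing order.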

Taxonomy

Topics: Algorithms and Data Compression · Natural Language Processing Techniques · semigroups and automata theory