Detecting regularities on grammar-compressed strings
Tomohiro I, Wataru Matsubara, Kouji Shimohira, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Kazuyuki Narisawa, Ayumi Shinohara

TL;DR
This paper presents algorithms that detect regularities such as runs, squares, gapped palindromes, periods, and covers in strings represented by straight-line programs (SLPs), operating on the compressed representation itself.
Contribution
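It introduces algorithms that detect and count these regularities directly on grammar-compressed strings, with running times stated in terms of the SLP size and derivation-tree height and only logarithmic dependence on the length of the uncompressed string.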
Findings
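Notation: n is the size of the SLP, N the length of the string it represents, h the height of its derivation tree, and g the gap length.
All runs and squares can be computed in O(n^3 h) time.
Gapped palindromes can be computed in O(n^3 h + g n h log N) time.
The periods and covers of the string can be found in O(n^2 h) time and O(n h (n + log^2 N)) time, respectively.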
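For intuition about what these regularities are, the following brute-force sketch computes periods, covers, and runs on an ordinary uncompressed string. It is not the paper's method, which works on the SLP; it is only a reference for the definitions. The function names, the 0-based half-open intervals, and the choice to count the full length as a trivial period are assumptions made for this illustration.

```python
def periods(s):
    """All p in 1..len(s) with s[i] == s[i + p] wherever both are defined.
    Assumption: the full length is counted as a (trivial) period."""
    n = len(s)
    return [p for p in range(1, n + 1)
            if all(s[i] == s[i + p] for i in range(n - p))]


def smallest_period(s):
    """Smallest period of a non-empty string."""
    return periods(s)[0]


def cover_lengths(s):
    """Lengths L such that occurrences of the prefix s[:L] cover every
    position of s (the whole string always covers itself)."""
    n = len(s)
    result = []
    for L in range(1, n + 1):
        c = s[:L]
        covered_up_to = 0  # positions [0, covered_up_to) are covered so far
        for i in range(n - L + 1):
            if i > covered_up_to:
                break  # a gap has opened; no later occurrence can close it
            if s.startswith(c, i):
                covered_up_to = i + L
        if covered_up_to == n:
            result.append(L)
    return result


def runs(s):
    """Maximal repetitions as (start, end, p): s[start:end] has smallest
    period p, length at least 2p, and cannot be extended left or right."""
    n = len(s)
    found = []
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if s[i] != s[i + p]:
                i += 1
                continue
            j = i
            while j + p < n and s[j] == s[j + p]:
                j += 1
            # s[i:j + p] is a maximal stretch in which p is a period
            if j - i >= p and smallest_period(s[i:j + p]) == p:
                found.append((i, j + p, p))
            i = j + 1
    return found
```

On the string "abaababa" this reports periods 5, 7, and 8, cover lengths 3 and 8 (the cover "aba" and the string itself), and three runs: "aa" with period 1, "ababa" with period 2, and "abaaba" with period 3. Every square is contained in some run, which is why runs and squares are handled together.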
Abstract
We solve the problems of detecting and counting various forms of regularities in a string represented as a Straight Line Program (SLP). Given an SLP of size n that represents a string s of length N, our algorithm computes all runs and squares in s in O(n^3 h) time and O(n^2) space, where h is the height of the derivation tree of the SLP. We also show an algorithm to compute all gapped-palindromes in O(n^3 h + g n h log N) time and O(n^2) space, where g is the length of the gap. The key technique of the above solution also allows us to compute the periods and covers of the string in O(n^2 h) time and O(n h (n + log^2 N)) time, respectively.
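To make the parameters in these bounds concrete, here is a minimal sketch of an SLP and of the quantities n, N, and h. The grammar, the variable names, and the convention that a leaf has height 1 are hypothetical choices for illustration, not taken from the paper. The sketch also shows that a character of the represented string can be read by descending the derivation tree, without expanding the whole string.

```python
from functools import lru_cache

# Hypothetical toy SLP: each variable derives either a single character
# or the concatenation of two previously defined variables.
RULES = {
    "X1": "a",
    "X2": "b",
    "X3": ("X1", "X2"),  # ab
    "X4": ("X3", "X1"),  # aba
    "X5": ("X4", "X3"),  # abaab
    "X6": ("X5", "X4"),  # abaababa
}


@lru_cache(maxsize=None)
def length(var):
    """Length of the string derived from var, computed without expansion."""
    rule = RULES[var]
    if isinstance(rule, str):
        return 1
    left, right = rule
    return length(left) + length(right)


@lru_cache(maxsize=None)
def height(var):
    """Height of var's derivation tree (assumption: a leaf has height 1)."""
    rule = RULES[var]
    if isinstance(rule, str):
        return 1
    left, right = rule
    return 1 + max(height(left), height(right))


def char_at(var, i):
    """Character at 0-based position i of the derived string, found by
    walking down the derivation tree in at most height(var) steps."""
    while True:
        rule = RULES[var]
        if isinstance(rule, str):
            return rule
        left, right = rule
        if i < length(left):
            var = left
        else:
            i -= length(left)
            var = right


def expand(var):
    """Full decompression, included only to check the functions above."""
    rule = RULES[var]
    if isinstance(rule, str):
        return rule
    left, right = rule
    return expand(left) + expand(right)
```

For this toy grammar, n = 6 (the number of variables), N = length("X6") = 8, and h = height("X6") = 5 under the leaf convention above. The grammar is too small to show compression, but repeated doubling rules of the same shape can make N exponential in n, which is the setting where bounds in n and h matter.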
Taxonomy
Topics: Algorithms and Data Compression · Natural Language Processing Techniques · Semigroups and Automata Theory
