Computing convolution on grammar-compressed text

Toshiya Tanaka; Tomohiro I; Shunsuke Inenaga; Hideo Bannai; Masayuki Takeda

arXiv:1303.3945 · cs.DS · March 19, 2013 · 2 cites

Open Access

TL;DR

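This paper gives algorithms for computing the convolution between a text given as a straight-line program (SLP) of size $n$ and an uncompressed pattern of length $m$, running in $O(\min\{nm, N-\alpha\} \log m)$ time, where $N$ is the length of the uncompressed text and $\alpha \geq 0$ measures the redundancy the SLP captures.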

Contribution

It presents a simple $O(nm \log m)$-time algorithm and an improved $O(\min\{nm, N-\alpha\} \log m)$-time algorithm for computing the convolution between an SLP-compressed text and an uncompressed pattern, using a trie-based technique for the improvement.

Findings

01

A simple algorithm computes the convolution between an SLP of size $n$ and a pattern of length $m$ in $O(nm \log m)$ time.

02

The improved algorithm runs in $O(\min\{nm, N-\alpha\} \log m)$ time, where $\alpha \geq 0$ represents the redundancy the SLP captures with respect to the length-$m$ substrings.

03

The improvement relies on a trie-based method for computing convolution.
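
For readers unfamiliar with SLPs, the following is a minimal sketch of what "grammar-compressed text" means. It is not taken from the paper: the dictionary encoding of rules, the function names, and the Fibonacci-word example are all assumptions chosen for illustration. Each variable derives either a single character or the concatenation of two earlier variables, so a grammar with $n$ rules can describe a text whose length $N$ is exponential in $n$.

```python
from functools import lru_cache

# Hypothetical encoding (an assumption, not the paper's notation):
# rules[X] is either a single character (X -> a)
# or a pair of variable names (X -> Y Z), as in Chomsky normal form.

def fibonacci_slp(k):
    """Build an SLP with k rules whose last variable derives a Fibonacci word."""
    rules = {"X1": "b", "X2": "a"}
    for i in range(3, k + 1):
        rules[f"X{i}"] = (f"X{i - 1}", f"X{i - 2}")
    return rules

def derived_length(rules, start):
    """Length N of the derived text, computed without expanding it."""
    @lru_cache(maxsize=None)
    def length(x):
        rhs = rules[x]
        if isinstance(rhs, str):
            return 1
        return length(rhs[0]) + length(rhs[1])
    return length(start)

def expand(rules, start):
    """Fully decompress the text (takes time proportional to N)."""
    rhs = rules[start]
    if isinstance(rhs, str):
        return rhs
    return expand(rules, rhs[0]) + expand(rules, rhs[1])

small = fibonacci_slp(6)
print(expand(small, "X6"))           # the text derived by a 6-rule SLP
large = fibonacci_slp(40)
print(len(large), derived_length(large, "X40"))  # n versus N
```

The gap between the number of rules and the derived length is what makes bounds in terms of $n$ rather than $N$ attractive for highly repetitive texts.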

Abstract

The convolution between a text string $S$ of length $N$ and a pattern string $P$ of length $m$ can be computed in $O(N \log m)$ time by FFT. It is known that various types of approximate string matching problems are reducible to convolution. In this paper, we assume that the input text string is given in a compressed form, as a straight-line program (SLP), which is a context-free grammar in Chomsky normal form that derives a single string. Given an SLP $\mathcal{S}$ of size $n$ describing a text $S$ of length $N$, and an uncompressed pattern $P$ of length $m$, we present a simple $O(nm \log m)$-time algorithm to compute the convolution between $S$ and $P$. We then show that this can be improved to $O(\min\{nm, N - \alpha\} \log m)$ time, where $\alpha \geq 0$ is a value that represents the amount of redundancy that the SLP captures with respect to the length-$m$ substrings. The…
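
As an illustration of the uncompressed baseline the abstract starts from, here is a minimal pure-Python sketch of convolution by FFT and of one well-known reduction from approximate matching (counting matching positions at every alignment) to convolution. It is not the paper's algorithm. It assumes the sliding dot-product form $C[i] = \sum_{j=0}^{m-1} A[i+j]\,B[j]$, and the function names are hypothetical. For simplicity it transforms the whole text at once, which costs $O(N \log N)$; the $O(N \log m)$ bound is usually obtained by processing the text in overlapping blocks of length $O(m)$, which this sketch omits.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def sliding_dot(a, b):
    """c[i] = sum_j a[i+j] * b[j] for 0 <= i <= len(a) - len(b); integer inputs."""
    n, m = len(a), len(b)
    size = 1
    while size < n + m:
        size *= 2
    fa = fft([complex(x) for x in a] + [0j] * (size - n))
    # Reversing b turns the sliding dot product into a linear convolution.
    fb = fft([complex(x) for x in reversed(b)] + [0j] * (size - m))
    prod = fft([x * y for x, y in zip(fa, fb)], invert=True)
    # The inverse transform above is unnormalized, so divide by size.
    return [round(prod[m - 1 + i].real / size) for i in range(n - m + 1)]

def match_counts(text, pattern):
    """Number of positions where pattern agrees with text at each alignment."""
    total = [0] * (len(text) - len(pattern) + 1)
    for c in set(pattern):
        a = [1 if x == c else 0 for x in text]      # indicator vector of c in text
        b = [1 if x == c else 0 for x in pattern]   # indicator vector of c in pattern
        for i, v in enumerate(sliding_dot(a, b)):
            total[i] += v
    return total

print(sliding_dot([1, 2, 3, 4], [1, 1]))
print(match_counts("abaab", "ab"))
```

Subtracting each entry of `match_counts` from $m$ gives the Hamming distance at that alignment, which is one example of an approximate matching problem reducible to convolution.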

Taxonomy

Topics: Algorithms and Data Compression · Natural Language Processing Techniques · Network Packet Processing and Optimization