Faster fully compressed pattern matching by recompression

Artur Jeż

arXiv:1111.3244 · cs.DS · June 26, 2013

TL;DR

This paper gives a faster algorithm for fully compressed pattern matching based on recompression. It runs in O((n+m)log M) time, which is linear in the compressed input sizes n and m up to a log M factor, improving on the previous O(n^2m) bound.

Contribution

The paper applies local recompression to fully compressed pattern matching, where both the pattern and the text are given as straight-line programs (SLPs), and obtains an improved running time.

Findings

01

Achieves O((n+m)log M) running time for fully compressed pattern matching, assuming that M, the size of the decompressed pattern, fits in O(1) machine words.

02

Outperforms the previous O(n^2m) algorithm by Lifshits.

03

Refactors both SLPs by local decompression followed by uniform recompression, so that substrings of the pattern and the text are encoded in the same way.

Abstract

In this paper, the fully compressed pattern matching problem is studied. The compression is represented by straight-line programs (SLPs), i.e., context-free grammars that each generate exactly one string; the term "fully" means that both the pattern and the text are given in compressed form. The problem is approached using a recently developed technique of local recompression: the SLPs are refactored so that substrings of the pattern and the text are encoded in both SLPs in the same way. To this end, the SLPs are locally decompressed and then recompressed in a uniform way. This technique yields an O((n+m)log M) algorithm for compressed pattern matching, assuming that M fits in O(1) machine words, where n and m are the sizes of the compressed representations of the text and the pattern, respectively, and M is the size of the decompressed pattern. If only m+n fits in O(1) machine words, the running time…
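To make the two notions in the abstract concrete, here is a minimal Python sketch. It is an illustration only and not the paper's algorithm: the toy SLP encoding and the names `expansion_lengths`, `expand`, `compress_pair`, and `count_occurrences` are all assumptions made for this example. The first part shows an SLP and how the length of the derived string can be computed without decompressing it. The second part shows one pair-compression step applied uniformly to an explicit pattern and text; the actual technique performs such steps on the grammars themselves. Boundary cases, such as a pattern that starts with the second letter of the compressed pair, need extra care that this sketch ignores.

```python
from typing import Dict, Tuple, Union

# Hypothetical toy encoding of an SLP: every nonterminal maps either to a
# single terminal letter or to a pair of nonterminals.
Rule = Union[str, Tuple[str, str]]
SLP = Dict[str, Rule]


def expansion_lengths(slp: SLP) -> Dict[str, int]:
    """Length of the string derived by each nonterminal, without expanding it."""
    lengths: Dict[str, int] = {}

    def length(symbol: str) -> int:
        if symbol not in lengths:
            rule = slp[symbol]
            if isinstance(rule, str):
                lengths[symbol] = 1
            else:
                left, right = rule
                lengths[symbol] = length(left) + length(right)
        return lengths[symbol]

    for symbol in slp:
        length(symbol)
    return lengths


def expand(slp: SLP, symbol: str) -> str:
    """Fully decompress one nonterminal (can be exponential in the SLP size)."""
    rule = slp[symbol]
    if isinstance(rule, str):
        return rule
    left, right = rule
    return expand(slp, left) + expand(slp, right)


def compress_pair(word: str, pair: str, fresh: str) -> str:
    """Replace every occurrence of the pair ab (with a != b) by a fresh letter.

    Because a != b, occurrences of ab cannot overlap, so the result is
    unambiguous and the same rule can be applied to pattern and text alike.
    """
    a, b = pair
    assert a != b and fresh not in word
    return word.replace(pair, fresh)


def count_occurrences(pattern: str, text: str) -> int:
    """Naive count of (possibly overlapping) occurrences on explicit strings."""
    return sum(
        text.startswith(pattern, i)
        for i in range(len(text) - len(pattern) + 1)
    )


# A small SLP deriving a Fibonacci word: six rules for a string of length 8.
FIB: SLP = {
    "X1": "b",
    "X2": "a",
    "X3": ("X2", "X1"),
    "X4": ("X3", "X2"),
    "X5": ("X4", "X3"),
    "X6": ("X5", "X4"),
}

text = expand(FIB, "X6")     # "abaababa"
pattern = expand(FIB, "X5")  # "abaab"

# One uniform compression step: the pair "ab" becomes the fresh letter "c"
# in both strings, which shortens them while preserving this occurrence.
short_text = compress_pair(text, "ab", "c")        # "cacca"
short_pattern = compress_pair(pattern, "ab", "c")  # "cac"
```

In this example the pattern occurs once in the text both before and after the compression step, which is the invariant the uniform re-encoding is meant to maintain. Here the pattern begins with `a` and ends with `b`, so no occurrence is cut at a pair boundary.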

Taxonomy

Topics: Algorithms and Data Compression · semigroups and automata theory · Network Packet Processing and Optimization