Worst case efficient single and multiple string matching in the Word-RAM model
Djamal Belazzougui

TL;DR
This paper develops worst-case efficient data structures for single and multiple string matching in the word RAM model, achieving near-optimal query times with linear space, advancing theoretical bounds in pattern matching algorithms.
Contribution
It introduces new linear space data structures that approach the optimal query time for string matching problems in the word RAM model, including techniques to make query times independent of pattern length.
Findings
Single pattern matching query time: O(n(1/m + (log sigma)/w) + occ)
Multiple pattern matching query time: O(n((log d + log y + log log d)/y + (log sigma)/w) + occ)
Techniques to achieve pattern-length-independent query times using the Four Russians method
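The tabulation idea behind the Four Russians method can be illustrated with a small sketch. The code below is not the paper's construction and does not reproduce its space or time bounds. It assumes a standard KMP-style automaton for one pattern and precomputes, for every (state, block of k characters) pair, the resulting state and the match positions inside the block, so the scan performs one table lookup per k text characters. The names build_dfa, build_block_table, search and the block size k are hypothetical choices made for illustration.

```python
from itertools import product


def build_dfa(pattern, alphabet):
    """KMP-style DFA: dfa[state][c] is the next state; state m means a match ended."""
    m = len(pattern)
    dfa = [dict.fromkeys(alphabet, 0) for _ in range(m + 1)]
    dfa[0][pattern[0]] = 1
    x = 0  # restart state: state reached on pattern[1:j]
    for j in range(1, m + 1):
        for c in alphabet:
            dfa[j][c] = dfa[x][c]          # mismatch: behave like the restart state
        if j < m:
            dfa[j][pattern[j]] = j + 1     # match: advance
            x = dfa[x][pattern[j]]
    return dfa


def build_block_table(dfa, alphabet, m, k):
    """table[state, block] = (state after block, offsets in block where a match ends)."""
    table = {}
    for state in range(m + 1):
        for block in product(alphabet, repeat=k):
            s, ends = state, []
            for i, c in enumerate(block):
                s = dfa[s][c]
                if s == m:
                    ends.append(i)
            table[state, "".join(block)] = (s, tuple(ends))
    return table


def search(text, pattern, alphabet, k=4):
    """Report start positions of all occurrences, consuming k characters per lookup."""
    m = len(pattern)
    dfa = build_dfa(pattern, alphabet)
    table = build_block_table(dfa, alphabet, m, k)
    state, occ = 0, []
    full = len(text) - len(text) % k
    for start in range(0, full, k):
        state, ends = table[state, text[start:start + k]]
        occ.extend(start + e - m + 1 for e in ends)
    for i in range(full, len(text)):       # leftover tail, one character at a time
        state = dfa[state][text[i]]
        if state == m:
            occ.append(i - m + 1)
    return occ
```

The table in this sketch has (m + 1) * sigma^k entries, so k has to stay small. It only shows why precomputing the answers for all short blocks removes per-character work from the scan.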
Abstract
In this paper, we explore worst-case solutions for the problems of single and multiple matching on strings in the word RAM model with word length w. In the first problem, we have to build a data structure based on a pattern p of length m over an alphabet of size sigma such that we can answer the following query: given a text T of length n, where each character is encoded using log sigma bits, return the positions of all occurrences of p in T (in the following, occ denotes the number of reported occurrences). For the multi-pattern matching problem, we have a set S of d patterns of total length m, and a query on a text T consists of finding all positions of all occurrences in T of the patterns in S. As each character of the text is encoded using log sigma bits and we can read w bits in constant time in the RAM model, we assume that we can read up to (w/log sigma) consecutive…
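The packed-text assumption in the abstract can be made concrete with a short sketch. It only illustrates the input model, not any algorithm from the paper: each character is stored in ceil(log2 sigma) bits, so one w-bit word holds floor(w / bits) characters and a text of length n occupies about n log sigma / w words. The function names pack_text and unpack_char and the default w = 64 are assumptions made for this example.

```python
def pack_text(text, alphabet, w=64):
    """Pack text into w-bit words using ceil(log2 sigma) bits per character."""
    sigma = len(alphabet)
    bits = max(1, (sigma - 1).bit_length())   # bits per character
    code = {c: i for i, c in enumerate(alphabet)}
    per_word = w // bits                       # characters obtained by one word read
    words = []
    for start in range(0, len(text), per_word):
        word = 0
        for j, c in enumerate(text[start:start + per_word]):
            word |= code[c] << (j * bits)      # character j sits at bit offset j * bits
        words.append(word)
    return words, bits, per_word


def unpack_char(words, bits, per_word, i):
    """Return the integer code of character i using a shift and a mask."""
    word = words[i // per_word]
    return (word >> ((i % per_word) * bits)) & ((1 << bits) - 1)
```

For a DNA alphabet (sigma = 4, 2 bits per character) and w = 64, a single word read yields 32 characters, which is where the (log sigma)/w term in the query bounds comes from.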
