On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching

Johannes Fischer; Dominik Köppl; Florian Kurpicz

arXiv:1606.02465·cs.DS·June 9, 2016

TL;DR

This paper presents parallel CREW-PRAM algorithms for exact and approximate pattern matching with suffix arrays. Both rest on a data structure that merges the suffix array intervals of two patterns into the interval of their concatenation.

Contribution

It presents a data structure that, given the suffix array intervals of two patterns $P$ and $P'$, computes the interval of $PP'$ in $O(\lg \lg n)$ sequential time, and uses it as the workhorse of parallel algorithms for exact and approximate pattern matching.

Findings

01

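The suffix array interval of a pattern of length $m$ is computed in $O(\frac{m}{p} + \lg p + \lg \lg p \cdot \lg \lg n)$ time on a CREW-PRAM with $p \leq m$ processors.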

02

Two suffix array intervals are merged in $O(\lg \lg n)$ sequential time, with a parallel variant as well.

03

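Approximate matching with $k$ differences or mismatches reports all occurrences in $O(\frac{m^{k} \sigma^{k}}{p} \max(k, \lg \lg n) + (1 + \frac{m}{p}) \lg p \cdot \lg \lg n + \mathrm{occ})$ time for $p \leq \sigma^{k} m^{k}$.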

Abstract

We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with $p$ processors. Given a static text of length $n$, we first show how to compute the suffix array interval of a given pattern of length $m$ in $O(\frac{m}{p} + \lg p + \lg \lg p \cdot \lg \lg n)$ time for $p \leq m$. For approximate pattern matching with $k$ differences or mismatches, we show how to compute all occurrences of a given pattern in $O(\frac{m^{k} \sigma^{k}}{p} \max(k, \lg \lg n) + (1 + \frac{m}{p}) \lg p \cdot \lg \lg n + \mathrm{occ})$ time, where $\sigma$ is the size of the alphabet and $p \leq \sigma^{k} m^{k}$. The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: Given the suffix array intervals for two patterns $P$ and $P'$, we present a data structure for computing the interval of $P P'$ in $O(\lg \lg n)$ sequential time, or…
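
To make the merging step concrete, here is a minimal sequential sketch in Python. It is not the paper's data structure: it assumes a plain suffix array plus its inverse and uses binary search, so a merge costs $O(\lg n)$ rather than the $O(\lg \lg n)$ stated in the abstract. All function names (`build_suffix_array`, `sa_interval`, `merge_intervals`) are hypothetical and chosen for illustration. The sketch relies on one observation: within the interval of $P$, suffixes are sorted by whatever follows $P$, so the ranks of those continuations increase along the interval and the sub-range whose continuation starts with $P'$ can be found by binary search.

```python
def build_suffix_array(text):
    # Naive construction by sorting suffixes; fine for a small illustration,
    # not the construction one would use on a large text.
    return sorted(range(len(text)), key=lambda i: text[i:])


def inverse_suffix_array(sa):
    # isa[j] is the rank of the suffix starting at text position j.
    isa = [0] * len(sa)
    for rank, pos in enumerate(sa):
        isa[pos] = rank
    return isa


def sa_interval(text, sa, pattern):
    # Half-open range [lo, hi) of ranks whose suffix starts with `pattern`.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first rank whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start, hi = lo, len(sa)
    while lo < hi:  # first rank whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


def merge_intervals(sa, isa, iv_p, len_p, iv_q):
    # Given the intervals of P (length len_p) and P', return the interval of PP'.
    # Assumption of this sketch: binary search over the inverse suffix array.
    n = len(sa)
    lo_p, hi_p = iv_p
    lo_q, hi_q = iv_q

    def rank_after(i):
        # Rank of the suffix that follows P in the i-th smallest suffix.
        # The empty suffix (P ends the text) ranks below every real suffix.
        j = sa[i] + len_p
        return isa[j] if j < n else -1

    def first_at_least(bound, lo, hi):
        # rank_after is increasing on [lo_p, hi_p), so binary search applies.
        while lo < hi:
            mid = (lo + hi) // 2
            if rank_after(mid) < bound:
                lo = mid + 1
            else:
                hi = mid
        return lo

    start = first_at_least(lo_q, lo_p, hi_p)
    end = first_at_least(hi_q, start, hi_p)
    return start, end
```

For example, on the text `banana` one can compute the intervals of `a` and `na` with `sa_interval`, pass them to `merge_intervals` with `len_p = 1`, and compare the result against `sa_interval(text, sa, "ana")`; an empty result is returned as a range with equal endpoints.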
