PaREM: A Novel Approach for Parallel Regular Expression Matching

Suejb Memeti; Sabri Pllana

arXiv:1412.1741 · cs.FL · June 30, 2015

TL;DR

This paper introduces PaREM, a parallel algorithm and code-generation tool for regular expression matching via deterministic finite automata. On a shared-memory system it reaches speed-ups of up to 21x over a sequential matcher.

Contribution

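The paper presents a novel parallel matching algorithm based on deterministic finite automata, together with the PaREM tool, which accepts regular expressions and finite automata as input and automatically generates code for parallel execution on shared-memory systems.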

Findings

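01. Achieved speed-ups of up to 21x with 48 threads on a dual-socket shared-memory system with 24 physical cores.

02. The generated code targets parallel execution on shared-memory systems.

03. The parallel algorithm was evaluated empirically against a commonly used sequential regular expression matching algorithm.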

Abstract

Regular expression matching is essential for many applications, such as finding patterns in text, exploring substrings in large DNA sequences, or lexical analysis. However, sequential regular expression matching may be time-prohibitive for large problem sizes. In this paper, we describe a novel algorithm for parallel regular expression matching via deterministic finite automata. Furthermore, we present our tool PaREM that accepts regular expressions and finite automata as input and automatically generates the corresponding code for our algorithm that is amenable to parallel execution on shared-memory systems. We evaluate our parallel algorithm empirically by comparing it with a commonly used algorithm for sequential regular expression matching. Experiments on a dual-socket shared-memory system with 24 physical cores show speed-ups of up to 21x for 48 threads.
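
To make the idea of parallel matching via a deterministic finite automaton more concrete, the following is a minimal sketch of one common way to parallelize DFA matching: split the input into chunks, let each worker compute where every possible entry state would end up after its chunk, then chain those per-chunk maps in order. It is an illustration under stated assumptions, not the code that PaREM generates. The DFA for `(ab)*`, the names `chunk_map`, `split`, and `parallel_match`, and the chunking policy are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical DFA for the regular expression (ab)*.
# State 0 is the start state and the only accepting state; state 2 is a dead state.
TRANSITIONS = {
    0: {"a": 1, "b": 2},
    1: {"a": 2, "b": 0},
    2: {"a": 2, "b": 2},
}
START, ACCEPTING = 0, {0}


def run_dfa(state, text):
    """Sequential baseline: feed text through the DFA starting from state."""
    for ch in text:
        state = TRANSITIONS[state][ch]
    return state


def chunk_map(chunk):
    """For one chunk, map every possible entry state to its exit state.

    A worker does not know which state the preceding chunk ends in,
    so it tries all of them (an assumption of this sketch).
    """
    return {s: run_dfa(s, chunk) for s in TRANSITIONS}


def split(text, parts):
    """Cut text into at most `parts` contiguous chunks of near-equal size."""
    size = max(1, -(-len(text) // parts))  # ceiling division
    return [text[i:i + size] for i in range(0, len(text), size)]


def parallel_match(text, workers=4):
    """Return True if the DFA accepts text, processing chunks concurrently."""
    chunks = split(text, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        maps = list(pool.map(chunk_map, chunks))
    # Chain the per-chunk maps in input order, starting from the real start state.
    state = START
    for m in maps:
        state = m[state]
    return state in ACCEPTING


if __name__ == "__main__":
    print(parallel_match("ab" * 1000))       # input in the language (ab)*
    print(parallel_match("ab" * 1000 + "a")) # input not in the language
```

Trying every entry state multiplies the work per chunk by the number of DFA states, which is one reason measured speed-up can fall short of the thread count. In CPython, threads will not speed up this CPU-bound loop because of the global interpreter lock; the sketch shows the decomposition only, and a compiled language with real shared-memory threads would be needed to observe a speed-up.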
