A fast implementation of the good-suffix array for the Boyer-Moore string matching algorithm

Thierry Lecroq

arXiv:2402.16469 · cs.DS · February 27, 2024 · 1 citation

TL;DR

This paper presents a fast implementation of the good-suffix table computation for the Boyer-Moore string matching algorithm. It is based on a tight analysis of the pattern, and in the reported experiments two versions of it are the fastest in almost all tested situations.

Contribution

It presents a faster way to compute the good-suffix table used by Boyer-Moore, avoiding the redundant operations that textbook solutions perform during this preprocessing step.

Findings

01

Two versions of the new implementation are the fastest in almost all tested situations.

02

Experimental results demonstrate significant speed improvements.

03

The method avoids redundant operations that textbook computations of the good-suffix table perform.

Abstract

String matching is the problem of finding all the occurrences of a pattern in a text. It has been intensively studied, and the Boyer-Moore string matching algorithm is probably one of the most famous solutions to this problem. This algorithm uses two precomputed shift tables called the good-suffix table and the bad-character table. The good-suffix table is tricky to compute in linear time. Textbook solutions perform redundant operations. Here we present a fast implementation for this good-suffix table based on a tight analysis of the pattern. Experimental results show that two versions of this new implementation are the fastest in almost all tested situations.
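For orientation, the following is a minimal sketch of the classic textbook-style linear-time computation of the good-suffix table, the kind of baseline the abstract describes as performing redundant operations. It is not the paper's new implementation. The function names `suffixes` and `good_suffix_table` are illustrative choices, and the sketch assumes the strong good-suffix rule with 0-based indexing, where `shift[j]` is the shift to apply after a mismatch at pattern position `j`.

```python
def suffixes(x):
    """Return suff, where suff[i] is the length of the longest
    common suffix of x[:i+1] and x (textbook helper table)."""
    m = len(x)
    if m == 0:
        return []
    suff = [0] * m
    suff[m - 1] = m
    g = m - 1  # leftmost position reached by the last explicit scan, minus matches
    f = m - 1  # position where the last explicit scan started
    for i in range(m - 2, -1, -1):
        if i > g and suff[i + m - 1 - f] < i - g:
            # Value can be copied from an already computed entry.
            suff[i] = suff[i + m - 1 - f]
        else:
            if i < g:
                g = i
            f = i
            # Explicit right-to-left comparison against the end of x.
            while g >= 0 and x[g] == x[g + m - 1 - f]:
                g -= 1
            suff[i] = f - g
    return suff


def good_suffix_table(x):
    """Return the good-suffix shift table of pattern x
    (textbook method; illustrative, not the paper's algorithm)."""
    m = len(x)
    suff = suffixes(x)
    shift = [m] * m
    # Case 1: a suffix of the matched part is a prefix of x.
    j = 0
    for i in range(m - 1, -1, -1):
        if suff[i] == i + 1:
            while j < m - 1 - i:
                if shift[j] == m:
                    shift[j] = m - 1 - i
                j += 1
    # Case 2: the matched suffix reoccurs inside x,
    # preceded by a different character.
    for i in range(m - 1):
        shift[m - 1 - suff[i]] = m - 1 - i
    return shift
```

For the pattern `"GCAGAGAG"`, `suffixes` returns `[1, 0, 0, 2, 0, 4, 0, 8]` and `good_suffix_table` returns `[7, 7, 7, 2, 7, 4, 7, 1]`. Note that the second loop may overwrite entries set by the first, which is one place where such a baseline does repeated work on the same table cell.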

Taxonomy

Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · DNA and Biological Computing