A fast implementation of the good-suffix array for the Boyer-Moore string matching algorithm
Thierry Lecroq

TL;DR
This paper presents a fast implementation of the good-suffix array (the good-suffix shift table) for the Boyer-Moore string matching algorithm. It is based on a tight analysis of the pattern, and in the reported experiments two versions of it are the fastest in almost all tested situations.
Contribution
It presents a faster way to compute the good-suffix table for Boyer-Moore in linear time, avoiding the redundant operations that textbook solutions perform.
Findings
Two versions of the new implementation are the fastest in almost all tested situations.
Experimental results demonstrate significant speed improvements.
The method relies on a tight analysis of the pattern to avoid redundant operations when computing the good-suffix table.
Abstract
String matching is the problem of finding all the occurrences of a pattern in a text. It has been intensively studied, and the Boyer-Moore string matching algorithm is probably one of the most famous solutions to this problem. This algorithm uses two precomputed shift tables called the good-suffix table and the bad-character table. The good-suffix table is tricky to compute in linear time. Textbook solutions perform redundant operations. Here we present a fast implementation for this good-suffix table based on a tight analysis of the pattern. Experimental results show two versions of this new implementation are the fastest in almost all tested situations.
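For orientation, the sketch below shows the classical textbook computation of the good-suffix table that the abstract refers to as the baseline. It is not the paper's new implementation, which is not described on this page. The function names `suffixes` and `good_suffix_table` are illustrative choices, and the empty-pattern guard is an added assumption.

```python
def suffixes(x):
    """suff[i] = length of the longest suffix of x that ends at position i."""
    m = len(x)
    if m == 0:  # guard added for this sketch
        return []
    suff = [0] * m
    suff[m - 1] = m
    g = m - 1  # left end of the leftmost suffix match found so far
    f = 0      # right end (start position) of that match
    for i in range(m - 2, -1, -1):
        if i > g and suff[i + m - 1 - f] < i - g:
            # Position i lies inside a known match: reuse an earlier value.
            suff[i] = suff[i + m - 1 - f]
        else:
            # Otherwise extend the comparison explicitly.
            if i < g:
                g = i
            f = i
            while g >= 0 and x[g] == x[g + m - 1 - f]:
                g -= 1
            suff[i] = f - g
    return suff


def good_suffix_table(x):
    """Classical linear-time good-suffix shift table for pattern x."""
    m = len(x)
    if m == 0:
        return []
    suff = suffixes(x)
    bm_gs = [m] * m
    # Case 1: a prefix of x matches a suffix of the matched part.
    j = 0
    for i in range(m - 1, -1, -1):
        if suff[i] == i + 1:
            while j < m - 1 - i:
                if bm_gs[j] == m:
                    bm_gs[j] = m - 1 - i
                j += 1
    # Case 2: the matched suffix reoccurs elsewhere in x.
    for i in range(m - 1):
        bm_gs[m - 1 - suff[i]] = m - 1 - i
    return bm_gs
```

For the pattern "GCAGAGAG", `suffixes` returns [1, 0, 0, 2, 0, 4, 0, 8] and `good_suffix_table` returns [7, 7, 7, 2, 7, 4, 7, 1]. The two passes over the pattern in `good_suffix_table`, on top of the pass in `suffixes`, are the kind of work a tighter analysis could plausibly reduce.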
Taxonomy
Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · DNA and Biological Computing
