Optimal-Hash Exact String Matching Algorithms
Thierry Lecroq

TL;DR
This paper introduces optimized hash-based string matching algorithms that improve speed for short patterns on large alphabets by ensuring unique hash values for pattern q-grams.
Contribution
It selects the minimal q-gram length at which every q-gram of the pattern receives a unique hash value, improving the efficiency of existing hash-based string matching algorithms.
Findings
Faster matching for short patterns on large alphabets.
Unique hash values for pattern q-grams improve algorithm performance.
The new algorithms outperform earlier HASH-family algorithms for short patterns on large alphabets.
Abstract
String matching is the problem of finding all the occurrences of a pattern in a text. We propose improved versions of the fast family of string matching algorithms based on hashing q-grams. The improvement consists of considering minimal values of q such that each q-gram of the pattern has a unique hash value. The new algorithms are faster than the algorithms of the HASH family for short patterns on large alphabets.
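The selection of q described in the abstract can be illustrated with a minimal sketch. It makes two assumptions that are not taken from the paper: the hash is a simple shift-and-add function reduced modulo a hypothetical table size of 256, and "unique" is read as "the q-grams at all positions of the pattern hash to pairwise distinct values". The names `qgram_hash` and `minimal_unique_q` are illustrative only.

```python
def qgram_hash(gram, table_size=256):
    """Shift-and-add hash of a q-gram (an assumed hash, not the paper's)."""
    h = 0
    for ch in gram:
        h = (h * 2 + ord(ch)) % table_size
    return h


def minimal_unique_q(pattern, table_size=256):
    """Return the smallest q for which the q-grams at all positions of
    `pattern` hash to pairwise distinct values (None for an empty pattern)."""
    m = len(pattern)
    for q in range(1, m + 1):
        # Hash every q-gram pattern[i:i+q], one per starting position.
        hashes = [qgram_hash(pattern[i:i + q], table_size)
                  for i in range(m - q + 1)]
        # Accept q as soon as no two positions share a hash value.
        if len(set(hashes)) == len(hashes):
            return q
    return None


if __name__ == "__main__":
    for p in ("abcd", "abab", "abca"):
        print(p, minimal_unique_q(p))
```

Under these assumptions, "abcd" needs only q = 1, while "abab" needs q = 3 because its 1-grams and 2-grams repeat. For "abca" the 2-grams "bc" and "ca" are different strings but collide under this particular hash, so q = 3 is chosen there as well.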
Taxonomy
Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · DNA and Biological Computing
