Designing optimal- and fast-on-average pattern matching algorithms
Gilles Didier, Laurent Tichit

TL;DR
This paper introduces a method to compute and achieve optimal average-case speeds for pattern matching algorithms, outperforming existing algorithms both theoretically and practically.
Contribution
It presents a general approach to determine the maximum expected speed of pattern matching algorithms and develops an algorithm to reach this speed, with a heuristic for longer patterns.
Findings
Proposed a method to compute the limit of expected speed over iid texts.
Developed an algorithm achieving the maximum possible speed within a class of algorithms.
Outperformed 9 existing pattern matching algorithms in both theory and practice.
Abstract
Given a pattern $w$ and a text $t$, the speed of a pattern matching algorithm over $t$ with regard to $w$ is the ratio of the length of $t$ to the number of text accesses performed to search for $w$ in $t$. We first propose a general method for computing the limit of the expected speed of pattern matching algorithms, with regard to $w$, over iid texts. Next, we show how to determine the greatest speed that can be achieved among a large class of algorithms, together with an algorithm that achieves this speed. Since the complexity of this determination makes it impossible to deal with patterns of length greater than 4, we propose a polynomial heuristic. Finally, our approaches are compared with 9 pre-existing pattern matching algorithms from both a theoretical and a practical point of view, i.e. both in terms of limit expected speed on iid texts, and in terms of observed average speed on real…
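The speed defined in the abstract (text length divided by the number of text accesses) can be measured empirically by instrumenting a matcher to count every read of a text character. The sketch below is illustrative only: it uses a naive scan and the classical Horspool algorithm as stand-ins, not the algorithm proposed in the paper, and the function names `naive_accesses`, `horspool_accesses`, and `speed` are hypothetical.

```python
def naive_accesses(pattern, text):
    """Naive left-to-right search; returns (match positions, text accesses)."""
    m, n = len(pattern), len(text)
    accesses = 0
    matches = []
    for pos in range(n - m + 1):
        j = 0
        while j < m:
            accesses += 1  # one read of text[pos + j]
            if text[pos + j] != pattern[j]:
                break
            j += 1
        if j == m:
            matches.append(pos)
    return matches, accesses


def horspool_accesses(pattern, text):
    """Horspool search; returns (match positions, text accesses)."""
    m, n = len(pattern), len(text)
    # Shift for each character: distance from its rightmost occurrence
    # (excluding the last pattern position) to the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    accesses = 0
    matches = []
    pos = 0
    while pos + m <= n:
        last = text[pos + m - 1]
        accesses += 1  # read of the window's last character
        if last == pattern[m - 1]:
            j = m - 2
            while j >= 0:
                accesses += 1  # one read of text[pos + j]
                if text[pos + j] != pattern[j]:
                    break
                j -= 1
            if j < 0:
                matches.append(pos)
        # Characters absent from the shift table allow a full-length jump.
        pos += shift.get(last, m)
    return matches, accesses


def speed(text_length, accesses):
    """Speed as defined in the abstract: text length / number of text accesses."""
    return text_length / accesses


if __name__ == "__main__":
    pattern, text = "abc", "xxxabcxxabc"
    for name, search in (("naive", naive_accesses), ("horspool", horspool_accesses)):
        found, count = search(pattern, text)
        print(name, found, count, speed(len(text), count))
```

A speed above 1 means the algorithm skipped some text positions entirely; a speed below 1 means it read some positions more than once. Averaging this quantity over many random texts gives an empirical estimate of the expected speed the paper analyzes.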
Taxonomy
Topics: Algorithms and Data Compression · Graph Theory and Algorithms · Machine Learning and Algorithms
