Algorithms for Jumbled Pattern Matching in Strings
Péter Burcsi, Ferdinando Cicalese, Gabriele Fici, Zsuzsanna Lipták

TL;DR
This paper introduces two new algorithms for jumbled pattern matching, that is, finding substrings whose character multiset (Parikh vector) equals a query. The first decides in constant time whether a query occurs in a binary text; the second finds all occurrences over a general finite alphabet in sub-linear expected time. Both use an index whose size is linear in the length of the text.
Contribution
It presents two novel algorithms for Parikh vector pattern matching: a constant-time decision algorithm for binary texts, and a sub-linear expected-time algorithm, in two variants, that finds all occurrences over a general finite alphabet. All of them use a linear-size index of the text.
Findings
Constant-time decision algorithm for binary texts.
Sub-linear expected-time algorithm, in two variants, that finds all occurrences over a general finite alphabet.
A linear-size index of the text suffices for both algorithms.
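To make the binary-text finding concrete, here is a minimal sketch of one way a linear-size index can answer the decision query in constant time. It relies on a standard observation rather than on details given in this summary: sliding a window of fixed length m one position changes its number of 1s by at most one, so the counts attained by length-m substrings form a contiguous interval. Storing only the minimum and maximum for each m is therefore enough. The function names `build_binary_index` and `occurs` are hypothetical, and the index construction below is a naive quadratic one chosen for clarity; it should not be read as the paper's construction.

```python
def build_binary_index(text):
    """For each window length m, record the min and max number of '1's
    over all substrings of length m. Naive O(n^2) construction (an
    illustrative assumption, not the paper's method)."""
    n = len(text)
    # prefix[i] = number of '1's in text[:i]
    prefix = [0]
    for c in text:
        prefix.append(prefix[-1] + (1 if c == "1" else 0))
    min_ones = [0] * (n + 1)
    max_ones = [0] * (n + 1)
    for m in range(1, n + 1):
        counts = [prefix[i + m] - prefix[i] for i in range(n - m + 1)]
        min_ones[m] = min(counts)
        max_ones[m] = max(counts)
    return min_ones, max_ones


def occurs(index, zeros, ones):
    """Decide in O(1) whether some substring has exactly `zeros` 0s
    and `ones` 1s, using the interval property described above."""
    min_ones, max_ones = index
    if zeros < 0 or ones < 0:
        return False
    m = zeros + ones
    if m >= len(min_ones):  # query is longer than the text
        return False
    return min_ones[m] <= ones <= max_ones[m]


index = build_binary_index("0110100")
print(occurs(index, 1, 2))  # one 0 and two 1s, e.g. "011"
print(occurs(index, 0, 3))  # three 1s in a row never appear
```

The index holds two integers per window length, so its size is linear in the text length, and each query is two comparisons.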
Abstract
The Parikh vector p(s) of a string s is defined as the vector of multiplicities of the characters. Parikh vector q occurs in s if s has a substring t with p(t)=q. We present two novel algorithms for searching for a query q in a text s. One solves the decision problem over a binary text in constant time, using a linear size index of the text. The second algorithm, for a general finite alphabet, finds all occurrences of a given Parikh vector q and has sub-linear expected time complexity; we present two variants, which both use a linear size index of the text.
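As an illustration of the definitions in the abstract, the following sketch computes a Parikh vector and finds all occurrences of a query q with a plain sliding window. This is the straightforward linear-time baseline with no index, not either of the paper's algorithms; it is included only to pin down what "q occurs in s" means. The names `parikh_vector` and `jumbled_occurrences` are hypothetical, and the code assumes every character of the text belongs to the given alphabet.

```python
from collections import Counter


def parikh_vector(s, alphabet):
    """Return the tuple of character multiplicities of s, in alphabet order."""
    counts = Counter(s)
    return tuple(counts[c] for c in alphabet)


def jumbled_occurrences(text, q, alphabet):
    """Return every start position i such that text[i:i+m] has Parikh
    vector q, where m = sum(q). Baseline sliding window, O(n * sigma)."""
    m = sum(q)
    # The empty query is treated as having no reported occurrences here.
    if m == 0 or m > len(text):
        return []
    index_of = {c: k for k, c in enumerate(alphabet)}
    target = list(q)
    # Parikh vector of the first window
    window = [0] * len(alphabet)
    for c in text[:m]:
        window[index_of[c]] += 1
    hits = [0] if window == target else []
    # Slide the window: add the entering character, drop the leaving one
    for i in range(m, len(text)):
        window[index_of[text[i]]] += 1
        window[index_of[text[i - m]]] -= 1
        if window == target:
            hits.append(i - m + 1)
    return hits


print(parikh_vector("abaccab", "abc"))
print(jumbled_occurrences("abaccab", (1, 1, 1), "abc"))
```

Every occurrence has length sum(q), which is why a single fixed-width window is enough for the baseline.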
