Compressed Pattern-Matching with Ranked Variables in Zimin Words
Radosław Głowinski, Wojciech Rytter

TL;DR
This paper explores pattern matching in Zimin words, leveraging their compressibility to develop efficient algorithms for ranked pattern matching, including shortest instance and valuation counting.
Contribution
It introduces a novel approach to pattern matching in Zimin words using compression techniques, providing new algorithms for ranked matching problems.
Findings
Efficient algorithms for ranked pattern matching in Zimin words.
Methods to find shortest pattern instances and count valuations.
Enhanced understanding of Zimin words' structure and compressibility.
Abstract
Zimin words are very special finite words which are closely related to the pattern-avoidability problem. This problem consists in testing whether an instance of a given pattern with variables occurs in almost all words over any finite alphabet. The problem is not well understood: no polynomial-time algorithm is known, and it is not known whether the problem is NP-hard. The pattern-avoidability problem is equivalent to searching for a pattern (with variables) in a Zimin word. The main difficulty is the potentially exponential size of Zimin words. We use special properties of Zimin words, especially that they are highly compressible, to design efficient algorithms for a special version of pattern matching, called here ranked matching. This gives a new interpretation of Zimin's algorithm in a compressed setting. We discuss the structure of rankings of variables and compressed representations of values of variables.…
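To make the size and compressibility remarks concrete, here is a minimal Python sketch. It assumes the standard definition of Zimin words, which this summary does not state: Z_1 = x_1 and Z_n = Z_{n-1} x_n Z_{n-1}, so |Z_n| = 2^n - 1. The names `zimin_word`, `zimin_letter`, and `occurs` are illustrative only. The matcher is a naive exponential-time backtracking search, not the paper's ranked-matching algorithm.

```python
def zimin_word(n):
    """Build Z_n explicitly as a list of letter indices 1..n.

    Assumes the standard recursion Z_k = Z_{k-1} k Z_{k-1}; the result has
    length 2**n - 1, so this is only feasible for small n.
    """
    word = []
    for k in range(1, n + 1):
        word = word + [k] + word
    return word


def zimin_letter(i):
    """Letter at 1-based position i of Z_n, without building the word.

    The letter is 1 + (number of trailing zero bits of i); this holds for
    every n with 2**n - 1 >= i, and is one simple form of compressed access.
    """
    return (i & -i).bit_length()


def occurs(pattern, word):
    """Naive check: does some non-erasing substitution of the variables in
    `pattern` (a string, one character per variable) yield a factor of `word`?
    """
    def match(p_idx, pos, val):
        if p_idx == len(pattern):
            return True
        var = pattern[p_idx]
        if var in val:
            # Variable already bound: its value must repeat here.
            end = pos + len(val[var])
            return word[pos:end] == val[var] and match(p_idx + 1, end, val)
        # Unbound variable: try every non-empty factor starting at pos.
        for end in range(pos + 1, len(word) + 1):
            val[var] = word[pos:end]
            if match(p_idx + 1, end, val):
                return True
            del val[var]
        return False

    return any(match(0, start, {}) for start in range(len(word)))
```

For example, `occurs("aba", zimin_word(2))` succeeds because Z_2 is itself an instance of the pattern, while `occurs("aa", zimin_word(1))` fails. The explicit word doubles in length with each extra variable, whereas `zimin_letter` answers position queries from the index alone, which illustrates why working on a compressed representation is attractive.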
Taxonomy
Topics: Natural Language Processing Techniques · semigroups and automata theory · Algorithms and Data Compression
