An Encoding for Order-Preserving Matching
Travis Gagie, Giovanni Manzini, Rossano Venturini

TL;DR
This paper introduces the first encoding data structure for order-preserving pattern matching, enabling efficient querying of pattern occurrences with minimal space, suitable for large datasets and short patterns.
Contribution
It presents an encoding structure that supports fast order-preserving pattern matching queries using significantly less space than storing the original data.
Findings
Space usage is within a constant factor of optimal.
Query time is optimal when the alphabet is large ($\log \sigma = \Omega(\log n)$).
Supports pattern lengths up to polylogarithmic in the data size.
Abstract
Encoding data structures store enough information to answer the queries they are meant to support but not enough to recover their underlying datasets. In this paper we give the first encoding data structure for the challenging problem of order-preserving pattern matching. This problem was introduced only a few years ago but has already attracted significant attention because of its applications in data analysis. Two strings are said to be an order-preserving match if the relative order of their characters is the same: e.g., $4, 1, 3, 2$ and $10, 3, 7, 5$ are an order-preserving match. We show how, given a string $S[1..n]$ over an arbitrary alphabet and a constant $c \geq 1$, we can build an $O(n \log \log n)$-bit encoding such that later, given a pattern $P[1..m]$ with $m \leq \log^c n$, we can return the number of order-preserving occurrences of $P$ in $S$ in $O(m)$ time.…
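To make the definition of an order-preserving match concrete, here is a minimal Python sketch of the matching relation and a naive occurrence counter. It is not the paper's encoding: it reads the full text and takes roughly O(n m log m) time, whereas the encoding answers queries without storing the text. The function names (`rank_pattern`, `is_op_match`, `count_op_occurrences`) and the choice to give equal values equal ranks are assumptions made for illustration only.

```python
from typing import List, Sequence


def rank_pattern(s: Sequence[int]) -> List[int]:
    """Replace each element by its rank among the distinct values of s.

    Two sequences have the same relative order exactly when their
    rank patterns are equal. Equal values receive equal ranks
    (an assumption; the source does not discuss ties).
    """
    ranks = {value: r for r, value in enumerate(sorted(set(s)))}
    return [ranks[value] for value in s]


def is_op_match(a: Sequence[int], b: Sequence[int]) -> bool:
    """Return True if a and b are an order-preserving match."""
    return len(a) == len(b) and rank_pattern(a) == rank_pattern(b)


def count_op_occurrences(text: Sequence[int], pattern: Sequence[int]) -> int:
    """Naively count windows of text that order-preserving match pattern."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return 0
    target = rank_pattern(pattern)
    return sum(
        1
        for i in range(len(text) - m + 1)
        if rank_pattern(text[i:i + m]) == target
    )


if __name__ == "__main__":
    # An illustrative pair: both sequences go "highest, lowest, second highest, second lowest".
    print(is_op_match([4, 1, 3, 2], [10, 3, 7, 5]))
    # Counts the windows [10, 3, 7, 5] and [12, 4, 9, 6].
    print(count_op_occurrences([10, 3, 7, 5, 12, 4, 9, 6], [4, 1, 3, 2]))
```

Because only the rank pattern of each window matters, the actual character values never need to be compared across strings, which is the intuition behind answering such queries from an encoding that cannot reconstruct the text.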
Taxonomy
Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · DNA and Biological Computing
