Testing Suffixient Sets
Davide Cenzato, Francisco Olivares, Nicola Prezza

TL;DR
This paper studies suffixient sets, a prefix array (PA) compression technique that keeps only a small sample of PA entries while still supporting pattern matching via binary search, and gives linear-time algorithms that test whether a given set of text positions is suffixient and whether it has minimum cardinality.
Contribution
It provides linear-time algorithms that decide whether a given subset of text positions is (1) a suffixient set or (2) a suffixient set of minimum cardinality.
Findings
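With a suffixient set, pattern matching via binary search remains possible while storing very few PA entries, provided random access to the text is available.
Whether a given subset of text positions is a suffixient set can be decided in linear time.
Whether a given subset of text positions is a suffixient set of minimum cardinality can also be decided in linear time.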
Abstract
Suffixient sets are a novel prefix array (PA) compression technique based on subsampling PA (rather than compressing the entire array like previous techniques used to do): by storing very few entries of PA (in fact, a compressed number of entries), one can prove that pattern matching via binary search is still possible provided that random access is available on the text. In this paper, we tackle the problems of determining whether a given subset of text positions is (1) a suffixient set or (2) a suffixient set of minimum cardinality. We provide linear-time algorithms solving these problems.
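The two testing problems can be illustrated with a small brute-force sketch. It is not the paper's linear-time algorithm. It assumes the definition of suffixient sets from the earlier literature, which this abstract does not restate: a set S of positions is suffixient for a text T if every one-character right-extension of every right-maximal substring of T is a suffix of some prefix T[1..x] with x in S. The function names and the 1-based position convention are illustrative choices, and the minimality check is exponential, so it is only meant for tiny inputs.

```python
from itertools import combinations


def right_extensions(text):
    """Collect every one-character right-extension of a right-maximal substring.

    Assumed definition: a substring alpha is right-maximal if alpha+a and
    alpha+b both occur in text for two distinct characters a and b.
    """
    n = len(text)
    followers = {}  # substring -> set of characters that follow it in text
    for i in range(n):
        for j in range(i, n):
            followers.setdefault(text[i:j], set()).add(text[j])
    extensions = set()
    for alpha, chars in followers.items():
        if len(chars) >= 2:  # alpha is right-maximal
            for c in chars:
                extensions.add(alpha + c)
    return extensions


def is_suffixient(text, positions):
    """Brute-force test (1): is `positions` (1-based) a suffixient set?"""
    prefixes = [text[:x] for x in positions]
    return all(
        any(p.endswith(ext) for p in prefixes)
        for ext in right_extensions(text)
    )


def is_minimum_suffixient(text, positions):
    """Brute-force test (2): suffixient and no smaller suffixient set exists."""
    if not is_suffixient(text, positions):
        return False
    n = len(text)
    k = len(set(positions))
    # Try every candidate set with fewer than k positions (exponential time).
    for size in range(k):
        for candidate in combinations(range(1, n + 1), size):
            if is_suffixient(text, candidate):
                return False
    return True


if __name__ == "__main__":
    text = "aab"
    print(is_suffixient(text, [2, 3]))
    print(is_suffixient(text, [1, 3]))
    print(is_minimum_suffixient(text, [2, 3]))
    print(is_minimum_suffixient(text, [1, 2, 3]))
```

For the text "aab", the right-maximal substrings under this definition are the empty string and "a", so the extensions to cover are "a", "b", "aa", and "ab". The prefixes ending at positions 2 and 3 cover all four, while no single position does.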
Taxonomy
Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · Semigroups and Automata Theory
