Time-Optimal Construction of String Synchronizing Sets
Jonas Ellert, Tomasz Kociumaka

TL;DR
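This paper shows how to construct string synchronizing sets in optimal time in the word RAM model: after a one-time preprocessing of the input string, a τ-synchronizing set can be built for any τ, in a representation that supports fast rank and select queries. Faster construction benefits applications in data compression, text indexing, and string similarity.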
Contribution
It gives a construction of τ-synchronizing sets whose preprocessing time and construction time are both optimal in the word RAM model, and whose output supports efficient rank and select queries.
Findings
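Preprocessing of a length-n string over an alphabet of size σ in O(n log σ / log n) time
Construction of a τ-synchronizing set, for any τ, in optimal O((n log τ)/(τ log n)) time after preprocessing
Constant-time select queries and near-constant-time rank queries on the constructed set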
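To make the query terminology concrete, here is a minimal Python sketch of rank and select over a set of positions stored as a sorted list. The class name `SortedPositions` and its methods are hypothetical names chosen for illustration, and the rank query uses plain binary search, so it runs in logarithmic time. It is not the sparse bitmask representation from the paper and does not reach the bounds listed above.

```python
from bisect import bisect_left


class SortedPositions:
    """Illustrative rank/select over a set of positions (hypothetical helper)."""

    def __init__(self, positions):
        # Store distinct positions in increasing order.
        self.pos = sorted(set(positions))

    def select(self, k):
        """Return the k-th smallest stored position (0-based)."""
        return self.pos[k]

    def rank(self, i):
        """Return how many stored positions are strictly smaller than i."""
        return bisect_left(self.pos, i)


sync_positions = SortedPositions([2, 5, 9, 14])
third = sync_positions.select(2)      # third smallest position
before_ten = sync_positions.rank(10)  # positions smaller than 10
```

Rank and select are inverse to each other in the sense that `select(rank(p))` returns `p` for every stored position `p`.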
Abstract
A key principle in string processing is local consistency: using short contexts to handle matching fragments of a string consistently. String synchronizing sets [Kempa, Kociumaka; STOC 2019] are an influential instantiation of this principle. A τ-synchronizing set of a length-n string is a set of O(n/τ) positions, chosen via their length-2τ contexts, such that (outside highly periodic regions) at least one position in every length-τ window is selected. Among their applications are faster algorithms for data compression, text indexing, and string similarity in the word RAM model. We show how to preprocess any string T ∈ [0..σ)^n in O(n log σ / log n) time so that, for any τ ∈ [1..n], a τ-synchronizing set of T can be constructed in O((n log τ)/(τ log n)) time. Both bounds are optimal in the word RAM model with word size…
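The definition can be illustrated with a small Python sketch of the classical minimizer-style rule: position i is selected when the smallest length-τ substring among those starting in [i..i+τ] starts at i or at i+τ. This is a naive quadratic-time illustration, not the time-optimal construction of the paper. The function name `naive_sync_set` and the use of the substring itself as its identifier are assumptions made for the example. The rule guarantees consistency (the decision for i depends only on the 2τ characters starting at i) and density, but because it does not treat highly periodic regions specially, it may select far more than O(n/τ) positions on periodic inputs.

```python
def naive_sync_set(text: str, tau: int) -> list[int]:
    """Naive illustration of a tau-synchronizing-style set (hypothetical helper).

    Position i is selected iff the minimum length-tau substring among those
    starting at i, i+1, ..., i+tau starts at i or at i+tau.
    """
    n = len(text)
    if tau < 1 or n < 2 * tau:
        return []
    # Identifier of each length-tau substring: here simply the substring itself,
    # compared lexicographically (an assumption made for simplicity).
    ids = [text[j:j + tau] for j in range(n - tau + 1)]
    selected = []
    for i in range(n - 2 * tau + 1):
        window = ids[i:i + tau + 1]  # identifiers at positions i .. i+tau
        smallest = min(window)
        if window[0] == smallest or window[-1] == smallest:
            selected.append(i)
    return selected


example_text = "abracadabra_simsalabim_abracadabra"
example_set = naive_sync_set(example_text, 3)
```

On a fully periodic input such as a run of a single character, every candidate position is selected, which shows why the actual definition handles highly periodic regions separately.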
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Machine Learning and Algorithms
