Counting Distinct Square Substrings in Sublinear Time
Panagiotis Charalampopoulos, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba

TL;DR
This paper presents the first sublinear-time algorithm for counting distinct square substrings in packed strings, leveraging novel data structures and combinatorial properties to handle complex run structures efficiently.
Contribution
It introduces a sublinear-time algorithm for counting squares in packed strings, utilizing new representations of long-period runs and sparse-Lyndon roots.
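For orientation, the quantity being counted can be pinned down with a brute-force baseline. The sketch below is not the paper's algorithm: it takes cubic time in the worst case, and the function name `count_distinct_squares` is chosen here for illustration only. A square is a non-empty substring of the form uu, and each distinct square is counted once no matter how many times it occurs.

```python
def count_distinct_squares(s: str) -> int:
    """Count distinct substrings of the form uu (u non-empty) by brute force."""
    n = len(s)
    squares = set()
    for i in range(n):
        # half is the length of the root u of a candidate square starting at i
        for half in range(1, (n - i) // 2 + 1):
            if s[i:i + half] == s[i + half:i + 2 * half]:
                squares.add(s[i:i + 2 * half])
    return len(squares)
```

For example, `count_distinct_squares("aabaab")` counts the two squares `aa` and `aabaab`.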
Findings
Achieves $O(n/\log_\sigma n)$ time complexity for counting squares
Develops implicit representations of long-period runs in sublinear time
Introduces sparse-Lyndon roots for efficient Lyndon root computation
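As background for the findings above, here is a minimal sketch of the standard notion of a Lyndon root, not the paper's sparse-Lyndon variant, which this summary does not define. For a periodic fragment with smallest period p, the Lyndon root is the lexicographically smallest rotation of its length-p prefix. The function name and the naive rotation scan are assumptions made for illustration; efficient methods avoid enumerating all rotations.

```python
def lyndon_root(run: str, period: int) -> str:
    """Return the smallest rotation of the length-`period` prefix of `run`.

    Assumes `period` is the smallest period of `run`, so the prefix is
    primitive and its minimal rotation is a Lyndon word.
    """
    root = run[:period]
    # Enumerate all rotations naively; O(period^2) character comparisons.
    return min(root[k:] + root[:k] for k in range(period))
```

For instance, the fragment `baabaab` has smallest period 3, and its Lyndon root is `aab`.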
Abstract
We show that the number of distinct squares in a packed string of length $n$ over an alphabet of size $\sigma$ can be computed in $O(n/\log_\sigma n)$ time in the word-RAM model. This paper is the first to introduce a sublinear-time algorithm for counting squares in the packed setting. The packed representation of a string of length $n$ over an alphabet of size $\sigma$ is given as a sequence of $O(n/\log_\sigma n)$ machine words in the word-RAM model (a machine word consists of $\omega \ge \log_2 n$ bits). Previously, it was known how to count distinct squares in $O(n)$ time [Gusfield and Stoye, JCSS 2004], even for a string over an integer alphabet [Crochemore et al., TCS 2014; Bannai et al., CPM 2017; Charalampopoulos et al., SPIRE 2020]. We use the techniques for extracting squares from runs described by Crochemore et al. [TCS 2014]. However, the packed model requires novel…
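The packed representation described in the abstract can be illustrated with a small sketch. It assumes symbols are given as integers in the range [0, sigma), and it uses a fixed `word_bits` parameter as a stand-in for the machine word size; the function names `pack` and `symbol_at` are hypothetical and not taken from the paper. Each symbol takes ceil(log2 sigma) bits, so one word holds several symbols, which is what makes a sublinear number of words possible.

```python
def pack(symbols, sigma, word_bits=64):
    """Pack integer symbols in [0, sigma) into words of word_bits bits each."""
    bits = max(1, (sigma - 1).bit_length())  # bits per symbol
    per_word = word_bits // bits             # symbols stored in one word
    words = []
    for start in range(0, len(symbols), per_word):
        chunk = symbols[start:start + per_word]
        w = 0
        for c in chunk:
            w = (w << bits) | c
        # Left-align a short final chunk so symbol offsets stay uniform.
        w <<= bits * (per_word - len(chunk))
        words.append(w)
    return words, bits, per_word


def symbol_at(words, i, bits, per_word):
    """Read back the i-th symbol from the packed words."""
    w = words[i // per_word]
    shift = bits * (per_word - 1 - i % per_word)
    return (w >> shift) & ((1 << bits) - 1)
```

With `sigma = 4` and `word_bits = 8`, nine symbols fit into three words of four symbols each, and `symbol_at` recovers every original symbol.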
