Counting distinct (non-)crossing substrings
Haruki Umezaki, Hiroki Shibata, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai

TL;DR
This paper introduces algorithms that count the distinct crossing and non-crossing substrings of a string at every position, extending earlier solutions for constant-size alphabets to general ordered alphabets (crossing counts) and linearly sortable alphabets (non-crossing counts).
Contribution
The paper presents new algorithms that compute both counts for all positions of the string in O(n) total time, whereas the earlier textbook solutions handle a single position at a time and assume a constant-size alphabet.
Findings
Both algorithms run in O(n) total time for all positions of a string of length n.
The crossing count is computed for general ordered alphabets.
The non-crossing count is computed for linearly sortable alphabets.
Abstract
Let $w$ be a string of length $n$. The problem of counting factors crossing a position, Problem 64 from the textbook "125 Problems in Text Algorithms" [Crochemore, Lecroq, and Rytter, 2021], asks to count the number $\mathcal{C}(w,k)$ (resp. $\mathcal{N}(w,k)$) of distinct substrings in $w$ that have occurrences containing (resp. not containing) a position $k$ in $w$. The solutions provided in their textbook compute $\mathcal{C}(w,k)$ and $\mathcal{N}(w,k)$ in $O(n)$ time for a single position $k$ in $w$, and thus a direct application would require $O(n^2)$ time for all positions $k$ in $w$. Their solution is designed for constant-size alphabets. In this paper, we present new algorithms which compute $\mathcal{C}(w,k)$ in $O(n)$ total time for general ordered alphabets, and $\mathcal{N}(w,k)$ in $O(n)$ total time for linearly sortable alphabets, for all positions $k =…
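To make the two counts in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm and runs in roughly cubic time per position, so it is only useful as a reference for checking faster implementations on small inputs. The function names `count_crossing_and_noncrossing` and `counts_for_all_positions` are hypothetical, and positions are assumed to be 1-indexed. A substring with one occurrence covering the position and another occurrence avoiding it contributes to both counts.

```python
from typing import List, Tuple


def count_crossing_and_noncrossing(w: str, k: int) -> Tuple[int, int]:
    """Return (crossing, noncrossing) for the 1-indexed position k of w.

    crossing    = number of distinct substrings with an occurrence covering k
    noncrossing = number of distinct substrings with an occurrence avoiding k
    Brute force for illustration only; assumed naming, not from the paper.
    """
    n = len(w)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(w)")

    crossing = set()
    noncrossing = set()
    for i in range(n):                 # 0-indexed start of the occurrence
        for j in range(i + 1, n + 1):  # exclusive end of the occurrence
            sub = w[i:j]
            # This occurrence covers the 1-indexed positions i+1 .. j.
            if i + 1 <= k <= j:
                crossing.add(sub)
            else:
                noncrossing.add(sub)
    return len(crossing), len(noncrossing)


def counts_for_all_positions(w: str) -> List[Tuple[int, int]]:
    """Apply the brute force at every position k = 1, ..., len(w)."""
    return [count_crossing_and_noncrossing(w, k) for k in range(1, len(w) + 1)]
```

For the string `aab` and position 2, the substrings with an occurrence covering the position are `a`, `aa`, `ab`, and `aab`, while those with an occurrence avoiding it are `a` and `b`, so the sketch returns `(4, 2)`. Note that `a` is counted on both sides.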
Taxonomy
Topics: Genome Rearrangement Algorithms · Algorithms and Data Compression · Advanced Combinatorial Mathematics
