Linear-time computation of generalized minimal absent words for multiple strings
Kouta Okabe, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai

TL;DR
This paper introduces an efficient algorithm for computing generalized minimal absent words across multiple strings, achieving optimal time complexity for specific cases and extending previous single-string methods.
Contribution
It generalizes the concept of minimal absent words to multiple strings and provides optimal algorithms for their computation in various scenarios.
Findings
Optimal $O(n + |\text{M}|)$ time for two strings
Extended algorithm for multiple strings with $O(n \lceil k / \log n \rceil + |\text{M}|)$ time
Efficient use of $O(n (k + \log n))$ bits of space in the word RAM model
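As a quick check on when the multi-string time bound matches the two-string one, here is a short calculation. It assumes $k \le c \log n$ for some constant $c \ge 1$; this condition is an illustrative assumption, not one stated in the findings above.

$$
n \left\lceil \frac{k}{\log n} \right\rceil + |\text{M}| \;\le\; n \lceil c \rceil + |\text{M}| \;=\; O(n + |\text{M}|)
$$

Under that assumption the $\lceil k / \log n \rceil$ factor is bounded by a constant, so the running time is linear in the input size plus the output size.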
Abstract
A string $w$ is called a minimal absent word (MAW) for a string $S$ if $w$ does not occur as a substring in $S$ and all proper substrings of $w$ occur in $S$. MAWs are well-studied combinatorial string objects that have potential applications in areas including bioinformatics, musicology, and data compression. In this paper, we generalize the notion of MAWs to a set $\mathcal{S}$ of multiple strings. We first describe our solution to the case of $k = 2$ strings, and show how to compute the set $\text{M}$ of MAWs in optimal $O(n + |\text{M}|)$ time and with $O(n)$ working space, where $n$ denotes the total length of the strings in $\mathcal{S}$. We then move on to the general case of $k > 2$ strings, and show how to compute the set $\text{M}$ of MAWs in $O(n \lceil k / \log n \rceil + |\text{M}|)$ time and with $O(n (k + \log n))$ bits of working space, in…
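To make the single-string MAW definition concrete, here is a naive brute-force sketch in Python. It is not the paper's linear-time algorithm and does not cover the generalized multi-string notion; the function name `naive_maws` and the `alphabet` parameter are hypothetical, introduced only for illustration. It relies on the fact that every proper substring of $w$ is a substring of either $w$ without its last character or $w$ without its first character, so checking those two is sufficient.

```python
def naive_maws(s, alphabet=None):
    """Return the set of minimal absent words of s (brute force, illustration only).

    `alphabet` is a hypothetical parameter: the characters that candidate
    words may use. It defaults to the characters occurring in s.
    """
    if alphabet is None:
        alphabet = set(s)

    # All substrings of s, including the empty string.
    subs = {""}
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            subs.add(s[i:j])

    maws = set()
    # Every MAW has the form x + c where x (its longest proper prefix) occurs in s.
    for x in subs:
        for c in alphabet:
            w = x + c
            # w must be absent, and its longest proper suffix must occur in s.
            if w not in subs and w[1:] in subs:
                maws.add(w)
    return maws


if __name__ == "__main__":
    print(sorted(naive_maws("abaab", alphabet="ab")))
```

For the string `abaab` over the alphabet `{a, b}`, this yields `aaa`, `aaba`, `bab`, and `bb`. The sketch takes time polynomial in the length of `s` and is meant only to illustrate the definition on small inputs.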
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Natural Language Processing Techniques
