Minimizing the Minimizers via Alphabet Reordering
Hilde Verbeek, Lorraine A.K. Ayad, Grigorios Loukides, Solon P. Pissis

TL;DR
This paper studies how to choose a total order on the alphabet so that minimizer sampling selects as few positions of a string as possible. It proves that this problem is NP-hard, which explains why no exact algorithms are known for it.
Contribution
It introduces the problem of alphabet reordering for minimizer sampling and proves its NP-hardness, providing theoretical insight into the challenge of minimizing minimizers.
Findings
Proves the NP-hardness of alphabet reordering for minimizer minimization.
Provides theoretical justification for the lack of exact algorithms.
Highlights the complexity of optimizing minimizer sampling in bioinformatics.
Abstract
Minimizers sampling is one of the most widely-used mechanisms for sampling strings [Roberts et al., Bioinformatics 2004]. Let $S = S[1] \ldots S[n]$ be a string over a totally ordered alphabet $\Sigma$. Further let $w \geq 2$ and $k \geq 1$ be two integers. The minimizer of $S[i \,..\, i+w+k-2]$ is the smallest position in $[i, i+w-1]$ where the lexicographically smallest length-$k$ substring of $S[i \,..\, i+w+k-2]$ starts. The set of minimizers over all $i \in [1, n-w-k+2]$ is the set $\mathcal{M}_{w,k}(S)$ of the minimizers of $S$. We consider the following basic problem: Given $S$, $w$, and $k$, can we efficiently compute a total order on $\Sigma$ that minimizes $|\mathcal{M}_{w,k}(S)|$? We show that this is unlikely by proving that the problem is NP-hard for any $w \geq 3$ and $k \geq 1$. Our result provides theoretical justification as to why there exist no exact algorithms…
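To make the definition in the abstract concrete, here is a minimal Python sketch. It is an illustration only, not the paper's method: the function names `minimizers` and `best_order_bruteforce` are hypothetical, positions are 0-based (the abstract uses 1-based positions), and the search simply tries every permutation of the alphabet, so it is usable only for very small alphabets.

```python
from itertools import permutations


def minimizers(s, w, k, order):
    """Return the set of 0-based minimizer positions of s.

    `order` lists the alphabet from smallest to largest letter.
    Each window covers w consecutive length-k substrings,
    i.e. w + k - 1 letters of s.
    """
    rank = {c: r for r, c in enumerate(order)}
    # Rank-encode every length-k substring so that tuple comparison
    # is lexicographic comparison under the chosen alphabet order.
    kmers = [tuple(rank[c] for c in s[i:i + k])
             for i in range(len(s) - k + 1)]
    selected = set()
    for i in range(len(kmers) - w + 1):
        window = kmers[i:i + w]
        # list.index returns the leftmost occurrence of the minimum,
        # matching "the smallest position" in the definition.
        selected.add(i + window.index(min(window)))
    return selected


def best_order_bruteforce(s, w, k):
    """Try every total order on the letters of s (factorial time)."""
    alphabet = sorted(set(s))
    return min(permutations(alphabet),
               key=lambda order: len(minimizers(s, w, k, order)))


print(minimizers("abb", 2, 1, "ab"))      # {0, 1}
print(minimizers("abb", 2, 1, "ba"))      # {1}
print(best_order_bruteforce("abb", 2, 1))  # ('b', 'a')
```

Even on the three-letter string `abb` with $w = 2$ and $k = 1$, the order matters: with a < b two positions are sampled, while with b < a a single position suffices. The exhaustive search above examines $|\Sigma|!$ orders, and the NP-hardness result indicates that a polynomial-time exact method is unlikely to exist.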
