RasBhari: optimizing spaced seeds for database searching, read mapping and alignment-free sequence comparison
Lars Hahn, Chris-André Leimeister, Rachid Ounit, Stefano Lonardi, Burkhard Morgenstern

TL;DR
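RasBhari is a tool that optimizes sets of spaced-seed patterns to improve the sensitivity and accuracy of sequence analysis tasks such as database searching, read mapping, and alignment-free sequence comparison. It does so by minimizing the overlap complexity of a pattern set, which is closely related to the variance of the number of pattern-based matches between two evolutionarily related sequences.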
Contribution
The paper proposes a modified hill-climbing algorithm, implemented as rasbhari, that optimizes pattern sets for database searching, read mapping, and alignment-free comparison of nucleic-acid sequences, and it reports higher sensitivity and accuracy than existing methods.
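To make the idea of hill climbing over pattern sets concrete, here is a minimal sketch. It is a generic hill climber, not the modified procedure that rasbhari implements. The move set (swapping one match and one don't-care position), the acceptance rule, and all function names and parameters are assumptions chosen for illustration. The objective is overlap complexity as commonly defined following Ilie and Ilie: for every pair of patterns and every relative shift, add 2 raised to the number of match positions the two patterns share.

```python
import random

def overlap_complexity(patterns):
    """Sum of 2**(shared match positions) over all pattern pairs and shifts."""
    total = 0
    for a, p in enumerate(patterns):
        for q in patterns[a:]:  # pairs (p, q), including p with itself
            for shift in range(1 - len(q), len(p)):
                shared = sum(
                    1 for i, c in enumerate(p)
                    if c == "1" and 0 <= i - shift < len(q) and q[i - shift] == "1"
                )
                total += 2 ** shared
    return total

def random_pattern(length, weight, rng):
    """Random pattern with `weight` match positions; both ends are matches."""
    inner = ["1"] * (weight - 2) + ["0"] * (length - weight)
    rng.shuffle(inner)
    return "1" + "".join(inner) + "1"

def swap_move(pattern, rng):
    """Swap one inner match position with one inner don't-care position."""
    chars = list(pattern)
    ones = [i for i in range(1, len(chars) - 1) if chars[i] == "1"]
    zeros = [i for i in range(1, len(chars) - 1) if chars[i] == "0"]
    if not ones or not zeros:
        return pattern  # nothing to swap
    i, j = rng.choice(ones), rng.choice(zeros)
    chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)

def hill_climb(num_patterns, length, weight, steps=2000, seed=0):
    """Hypothetical helper: keep a swap only if it lowers overlap complexity."""
    rng = random.Random(seed)
    patterns = [random_pattern(length, weight, rng) for _ in range(num_patterns)]
    best = overlap_complexity(patterns)
    for _ in range(steps):
        k = rng.randrange(num_patterns)
        candidate = patterns[:k] + [swap_move(patterns[k], rng)] + patterns[k + 1:]
        score = overlap_complexity(candidate)
        if score < best:
            patterns, best = candidate, score
    return patterns, best

if __name__ == "__main__":
    optimized, score = hill_climb(num_patterns=3, length=12, weight=6, steps=500)
    print(optimized, score)
```

Each swap preserves the length and weight of every pattern, so the search stays within the set of patterns with the requested parameters.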
Findings
RasBhari produces pattern sets with higher sensitivity for database searching than existing methods.
Pattern sets from rasbhari improve phylogenetic distance estimates in alignment-free sequence comparison.
Patterns generated with rasbhari enhance sensitivity in short-read classification with CLARK-S.
Abstract
Many algorithms for sequence analysis rely on word matching or word statistics. Often, these approaches can be improved if binary patterns representing match and don't-care positions are used as a filter, such that only those positions of words are considered that correspond to the match positions of the patterns. The performance of these approaches, however, depends on the underlying patterns. Herein, we show that the overlap complexity of a pattern set that was introduced by Ilie and Ilie is closely related to the variance of the number of matches between two evolutionarily related sequences with respect to this pattern set. We propose a modified hill-climbing algorithm to optimize pattern sets for database searching, read mapping and alignment-free sequence comparison of nucleic-acid sequences; our implementation of this algorithm is called rasbhari. Depending on the application at…
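The pattern-based filtering described in the abstract can be illustrated with a short sketch. A pattern is written here as a string of `1` (match) and `0` (don't-care) characters, and two sequence windows count as a match if they agree at every `1` position. The function names and the string representation are assumptions made for this example, not rasbhari's interface.

```python
def spaced_word(seq, start, pattern):
    """Characters of seq at the match ('1') positions of pattern, from start."""
    return "".join(seq[start + i] for i, c in enumerate(pattern) if c == "1")

def spaced_word_matches(s1, s2, pattern):
    """Count position pairs (i, j) where s1 and s2 agree on all match positions."""
    span = len(pattern)
    counts = {}
    # Index every spaced word of s1 by its content.
    for i in range(len(s1) - span + 1):
        word = spaced_word(s1, i, pattern)
        counts[word] = counts.get(word, 0) + 1
    # Look up every spaced word of s2 in that index.
    return sum(
        counts.get(spaced_word(s2, j, pattern), 0)
        for j in range(len(s2) - span + 1)
    )

if __name__ == "__main__":
    # The sequences differ at the don't-care position of the first window,
    # so that window still matches under pattern 101.
    print(spaced_word("ACGT", 0, "101"))
    print(spaced_word_matches("ACGT", "ATGT", "101"))
```

Because mismatches at don't-care positions are ignored, a spaced pattern can still register a match where a contiguous word of the same span would not.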
