Estimating seed sensitivity on homogeneous alignments
Gregory Kucherov (LIFL), Laurent Noe (LIFL), Yann Ponty (LRI)

TL;DR
This paper estimates the sensitivity of seed-based similarity search on homogeneous alignments rather than under Markov models. It gives an algorithm for counting and randomly generating such alignments, an algorithm for exact sensitivity computation over a broad class of seed strategies, and experiments showing the bias that arises when the homogeneousness condition is ignored.
Contribution
Two algorithms: one for counting and randomly generating homogeneous alignments, and one for exactly computing the sensitivity of a broad class of seed strategies on them, as an alternative to estimates based on Markov models.
Findings
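Sensitivity estimates computed on homogeneous alignments differ from those computed without the homogeneousness condition.
The proposed algorithm computes sensitivity exactly for a broad class of seed strategies.
Experiments show that ignoring the homogeneousness condition biases sensitivity estimates.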
Abstract
We address the problem of estimating the sensitivity of seed-based similarity search algorithms. In contrast to approaches based on Markov models [18, 6, 3, 4, 10], we study the estimation based on homogeneous alignments. We describe an algorithm for counting and random generation of those alignments and an algorithm for exact computation of the sensitivity for a broad class of seed strategies. We provide experimental results demonstrating a bias introduced by ignoring the homogeneousness condition.
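To make the quantities in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm; it only illustrates what "sensitivity on homogeneous alignments" could mean on a toy scale. Everything specific in it is an assumption not stated above: alignments are gapless strings of `1` (match) and `0` (mismatch), the scores are +1 and -1, a seed is a pattern of `#` (must match) and `-` (don't care), and an alignment is taken to be homogeneous when no contiguous segment scores higher than the whole alignment. The function names are hypothetical.

```python
from itertools import product

# Assumed scoring scheme (not taken from the source): +1 per match, -1 per mismatch.
MATCH, MISMATCH = 1, -1


def score(aln):
    """Score of a gapless alignment given as a string of '1' (match) / '0' (mismatch)."""
    return sum(MATCH if c == "1" else MISMATCH for c in aln)


def is_homogeneous(aln):
    """Assumed definition: no non-empty contiguous segment scores more than the whole."""
    total = score(aln)
    n = len(aln)
    return all(
        score(aln[i:j]) <= total
        for i in range(n)
        for j in range(i + 1, n + 1)
    )


def seed_hits(seed, aln):
    """True if the seed ('#' = required match, '-' = don't care) fits at some position."""
    span = len(seed)
    return any(
        all(s != "#" or aln[p + k] == "1" for k, s in enumerate(seed))
        for p in range(len(aln) - span + 1)
    )


def alignments_with_score(length, target):
    """Enumerate all alignments of the given length whose score equals target."""
    for bits in product("01", repeat=length):
        aln = "".join(bits)
        if score(aln) == target:
            yield aln


def sensitivity(seed, alignments):
    """Fraction of the given alignments hit by the seed (uniform weighting assumed)."""
    alignments = list(alignments)
    if not alignments:
        return 0.0
    return sum(seed_hits(seed, a) for a in alignments) / len(alignments)


if __name__ == "__main__":
    seed = "##-#"
    everything = list(alignments_with_score(12, 4))
    homogeneous = [a for a in everything if is_homogeneous(a)]
    print("all alignments:        ", sensitivity(seed, everything))
    print("homogeneous alignments:", sensitivity(seed, homogeneous))
```

Comparing the two printed values on small inputs gives a feel for how restricting to homogeneous alignments can shift a sensitivity estimate. The enumeration is exponential in the alignment length, so this is only usable for toy sizes; the counting and exact-computation algorithms the abstract refers to exist precisely to avoid it.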
