Estimating the Gumbel scale parameter for local alignment of random sequences by importance sampling with stopping times
Yonil Park, Sergey Sheetlin, John L. Spouge

TL;DR
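This paper gives a novel equation for the scale parameter of the Gumbel distribution that describes gapped local alignment scores of random sequences, and estimates that parameter by importance sampling with stopping times. Fast parameter estimation could make arbitrary alignment scoring schemes practical, which could in turn increase the sensitivity of biological database searches.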
Contribution
The paper presents a new equation for the Gumbel scale parameter and an efficient importance sampling algorithm for local alignment of random sequences.
Findings
The authors speculate that the new equation is exact, although the present numerical evidence is limited.
The importance sampling method improves estimation efficiency.
Simulations demonstrate accurate parameter estimation across scoring schemes.
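To make "importance sampling with stopping times" concrete, here is a minimal sketch on a toy problem, not the paper's algorithm. It assumes a one-dimensional random walk with steps of +1 or -1 and downward drift, in place of gapped alignment, and estimates the probability that the walk ever reaches a given level. The function name, parameters, and the toy model are all illustrative assumptions. The walk is simulated under an exponentially tilted distribution that drifts upward, stopped the first time it reaches the level, and reweighted by the likelihood ratio at that stopping time.

```python
import math
import random


def tilted_hit_probability(p_up, level, n_runs=1000, seed=0):
    """Estimate P(max_n S_n >= level) for a +/-1 random walk with P(+1) = p_up.

    Toy illustration only (hypothetical helper, not the paper's method).
    Requires p_up < 0.5 so the walk drifts downward, and an integer level >= 1.
    """
    if not 0.0 < p_up < 0.5:
        raise ValueError("p_up must lie in (0, 0.5) so the walk drifts downward")
    if level < 1:
        raise ValueError("level must be a positive integer")

    q_down = 1.0 - p_up
    # lam is the positive root of p_up*exp(lam) + q_down*exp(-lam) = 1.
    lam = math.log(q_down / p_up)
    # Tilted step law: P*(+1) = p_up * exp(lam) = q_down, so the walk drifts up.
    p_up_tilted = p_up * math.exp(lam)

    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        s = 0
        # Stopping time: first step at which the walk reaches the level.
        while s < level:
            s += 1 if rng.random() < p_up_tilted else -1
        # Likelihood ratio of original to tilted law along the stopped path.
        total += math.exp(-lam * s)
    return total / n_runs


if __name__ == "__main__":
    print(tilted_hit_probability(0.3, 10))
```

Because the steps in this toy are +1 or -1, the walk never overshoots the level, so every weight equals exp(-lam * level) and the estimate matches the exact value (p_up / (1 - p_up)) ** level. With other step distributions the overshoot varies from run to run and the weights differ, which is where averaging over many runs matters.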
Abstract
The gapped local alignment score of two random sequences follows a Gumbel distribution. If computers could estimate the parameters of the Gumbel distribution within one second, the use of arbitrary alignment scoring schemes could increase the sensitivity of searching biological sequence databases over the web. Accordingly, this article gives a novel equation for the scale parameter of the relevant Gumbel distribution. We speculate that the equation is exact, although present numerical evidence is limited. The equation involves ascending ladder variates in the global alignment of random sequences. In global alignment simulations, the ladder variates yield stopping times specifying random sequence lengths. Because of the random lengths, and because our trial distribution for importance sampling occurs on a different sample space from our target distribution, our study led to a mapping…
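As a rough illustration of why the scale parameter matters, the sketch below evaluates the standard Gumbel tail approximation for a local alignment score, P(S >= y) ≈ 1 - exp(-K m n exp(-lambda y)). This form is the usual Karlin-Altschul approximation and is an assumption here, since the abstract does not write it out. The function name, the pre-factor K, and the numeric values are hypothetical, and finite-length edge corrections are ignored.

```python
import math


def gumbel_tail(score, lam, k, m, n):
    """Approximate P(S >= score) for sequences of lengths m and n.

    lam is the Gumbel scale parameter and k the pre-factor; both depend on
    the scoring scheme and must be estimated separately (hypothetical inputs).
    """
    expected_hits = k * m * n * math.exp(-lam * score)
    # -expm1(-x) equals 1 - exp(-x) and stays accurate when x is tiny.
    return -math.expm1(-expected_hits)


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper.
    for lam in (0.25, 0.27):
        print(lam, gumbel_tail(score=60, lam=lam, k=0.04, m=300, n=1_000_000))
```

Because lambda sits in the exponent, a small change in its estimate shifts the tail probability of a high score by a large factor, which is why an accurate and fast estimate of the scale parameter is the focus.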
