Optimizing Smith-Waterman alignments

Rolf Olsen, Terence Hwa, and Michael Lässig

arXiv:cond-mat/9811225 · cond-mat · May 23, 2007 · Pacific Symposium on Biocomputing

TL;DR

This paper presents a statistical analysis of Smith-Waterman local sequence alignments and introduces a fidelity measure that captures alignment significance and can be optimized using only the alignment score data of a single sequence pair.

Contribution

It presents a statistical analysis, based on a recent scaling theory, and a new fidelity measure for local alignments, enabling penalty parameters to be optimized from the score data of a single sequence pair.

Findings

01 New fidelity measure effectively captures alignment significance

02 Optimization of penalty parameters improves alignment accuracy

03 The approach is validated through theoretical analysis and simulations

Abstract

Mutual correlation between segments of DNA or protein sequences can be detected by Smith-Waterman local alignments. We present a statistical analysis of alignment of such sequences, based on a recent scaling theory. A new fidelity measure is introduced and shown to capture the significance of the local alignment, i.e., the extent to which the correlated subsequences are correctly identified. It is demonstrated how the fidelity may be optimized in the space of penalty parameters using only the alignment score data of a single sequence pair.
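
For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of Smith-Waterman local alignment scoring. It is not taken from the paper: the function name `smith_waterman_score`, the linear gap penalty, and the default values of `match`, `mismatch`, and `gap` are illustrative assumptions, and the sketch does not compute the paper's fidelity measure. It shows only how the local alignment score depends on the penalty parameters that the paper proposes to optimize.

```python
def smith_waterman_score(a, b, match=1.0, mismatch=-1.0, gap=-2.0):
    """Best local alignment score of strings a and b, plus its end cell.

    Assumed scoring scheme (not from the paper): a fixed reward for a
    match, a fixed penalty for a mismatch, and a linear gap penalty.
    """
    prev = [0.0] * (len(b) + 1)      # previous row of the score matrix
    best, best_pos = 0.0, (0, 0)     # highest score seen and where it ends
    for i in range(1, len(a) + 1):
        curr = [0.0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Flooring at zero lets an alignment restart anywhere,
            # which is what makes the alignment local.
            curr[j] = max(0.0,
                          prev[j - 1] + s,    # align a[i-1] with b[j-1]
                          prev[j] + gap,      # gap in b
                          curr[j - 1] + gap)  # gap in a
            if curr[j] > best:
                best, best_pos = curr[j], (i, j)
        prev = curr
    return best, best_pos


if __name__ == "__main__":
    # Score the same pair under a few hypothetical gap penalties.
    for g in (-0.5, -1.0, -2.0):
        print(g, smith_waterman_score("TTACGTTT", "GGACGGG", gap=g))
```

Calling the function repeatedly on one sequence pair while varying `mismatch` and `gap` gives the kind of score data, as a function of penalty parameters, that the abstract refers to; how that data is turned into a fidelity estimate is described in the paper itself.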

Taxonomy

Topics: Genomics and Phylogenetic Studies · Algorithms and Data Compression · Stochastic processes and statistical mechanics