Algorithms for normalized multiple sequence alignments

Eloi Araujo; Luiz Rozante; Diego P. Rubert; Fabio V. Martinez

arXiv:2107.01607 · cs.DS · December 6, 2021

TL;DR

This paper introduces the first normalized methods for multiple sequence alignment (MSA), defining new scoring criteria, proving NP-hardness, and providing exact and approximation algorithms for improved alignment accuracy.

Contribution

The paper defines the first normalized scoring criteria for MSA, proves that computing them is NP-hard, and gives exact algorithms together with approximation algorithms for certain scoring matrices.

Findings

1. Normalized MSA criteria are NP-hard to compute.

2. Exact algorithms are proposed for the new criteria.

3. Approximation algorithms are provided for certain scoring matrices.

Abstract

Sequence alignment supports numerous tasks in bioinformatics, natural language processing, pattern recognition, social sciences, and other fields. While the alignment of two sequences can be performed swiftly in many applications, the simultaneous alignment of multiple sequences has proved to be naturally more intricate. Although most multiple sequence alignment (MSA) formulations are NP-hard, several approaches have been developed, as they can outperform pairwise alignment methods or are necessary for some applications. Taking into account not only similarities but also the lengths of the compared sequences (i.e. normalization) can provide better alignment results than either unnormalized or post-normalized approaches. While some normalized methods have been developed for pairwise sequence alignment, none have been proposed for MSA. This work is a first effort towards the development of…
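To make the idea of normalization concrete, the following is a minimal sketch of a normalized score for the pairwise case only: it minimizes alignment cost divided by alignment length (number of columns) over all alignments of two strings. It is not one of the paper's MSA algorithms. The function name `normalized_edit_distance`, the parameters `sub_cost` and `gap_cost`, and the unit-cost defaults are assumptions chosen for illustration.

```python
from math import inf


def normalized_edit_distance(s, t, sub_cost=1.0, gap_cost=1.0):
    """Return the minimum of (alignment cost / number of columns)
    over all alignments of s and t.

    Illustrative pairwise sketch; costs and names are assumptions.
    """
    n, m = len(s), len(t)
    if n == 0 and m == 0:
        return 0.0  # convention for two empty sequences

    # prev[i][j] = min cost of aligning s[:i] with t[:j]
    # using exactly (length - 1) columns; start with length 0.
    prev = [[inf] * (m + 1) for _ in range(n + 1)]
    prev[0][0] = 0.0
    best_ratio = inf

    # An alignment has at most n + m columns (all gaps).
    for length in range(1, n + m + 1):
        cur = [[inf] * (m + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            for j in range(m + 1):
                c = inf
                if i > 0 and j > 0:  # match or substitution column
                    step = 0.0 if s[i - 1] == t[j - 1] else sub_cost
                    c = min(c, prev[i - 1][j - 1] + step)
                if i > 0:  # symbol of s aligned with a gap
                    c = min(c, prev[i - 1][j] + gap_cost)
                if j > 0:  # symbol of t aligned with a gap
                    c = min(c, prev[i][j - 1] + gap_cost)
                cur[i][j] = c
        if cur[n][m] < inf:
            best_ratio = min(best_ratio, cur[n][m] / length)
        prev = cur
    return best_ratio
```

Tracking the alignment length as an extra dimension is what separates this from post-normalization, which would compute the ordinary minimum-cost alignment first and divide by a length afterwards. With `sub_cost=3.0`, for example, aligning `"a"` with `"b"` through two gap columns (cost 2 over 2 columns) gives a better ratio than a single substitution column (cost 3 over 1 column).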
