Alignment of protein-coding sequences with frameshift extension penalties

François Bélanger; Aïda Ouangraoua

arXiv:1508.04783·cs.DS·August 21, 2015

Open Access

TL;DR

This paper presents an algorithm for aligning protein-coding sequences that adds a penalty for frameshift extensions, so that the cost of a frameshifted region accounts for its extent as well as for varying codon substitution scores.

Contribution

The algorithm introduces a frameshift extension penalty and considers the full set of possible alignments without length constraints, improving upon previous methods.

Findings

1. Handles frameshift extensions with variable penalties
2. Maintains classical asymptotic complexity
3. Searches the set of all possible alignments between two coding sequences

Abstract

We introduce an algorithm for the alignment of protein-coding sequences accounting for frameshifts. The main specificity of this algorithm as compared to previously published protein-coding sequence alignment methods is the introduction of a penalty cost for frameshift extensions. Previous algorithms have only used constant frameshift penalties. This is similar to the use of scoring schemes with affine gap penalties in classical sequence alignment algorithms. However, the overall penalty of a frameshift portion in an alignment cannot be formulated as an affine function, because it should also incorporate varying codon substitution scores. The second specificity of the algorithm is its search space being the set of all possible alignments between two coding sequences, under the classical definition of an alignment between two DNA sequences. Previous algorithms have introduced…
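For readers unfamiliar with the affine gap penalties that the abstract uses as a point of comparison, the following is a minimal sketch of the classical three-matrix (Gotoh-style) recurrence for global alignment. It is not the paper's algorithm and does not model frameshifts or codons. The function name `affine_gap_score` and the scoring values (`match`, `mismatch`, `gap_open`, `gap_extend`) are illustrative assumptions, not taken from the paper.

```python
NEG_INF = float("-inf")


def affine_gap_score(a, b, match=1, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Score of the best global alignment of a and b under affine gaps.

    A gap of length k costs gap_open + k * gap_extend (illustrative values).
    """
    n, m = len(a), len(b)
    # M: alignment ends with a[i-1] paired with b[j-1]
    # X: alignment ends with a[i-1] facing a gap
    # Y: alignment ends with b[j-1] facing a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]

    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Opening a gap pays gap_open once; extending pays only gap_extend.
            X[i][j] = max(
                M[i - 1][j] + gap_open + gap_extend,
                Y[i - 1][j] + gap_open + gap_extend,
                X[i - 1][j] + gap_extend,
            )
            Y[i][j] = max(
                M[i][j - 1] + gap_open + gap_extend,
                X[i][j - 1] + gap_open + gap_extend,
                Y[i][j - 1] + gap_extend,
            )

    return max(M[n][m], X[n][m], Y[n][m])
```

With the default values above, `affine_gap_score("ACGT", "AT")` evaluates to -3: two matches (+2) and a single gap of length two (-3 - 2). The gap cost here depends only on gap length, which is what makes it affine. The abstract's point is that a frameshifted portion cannot be scored this way, because its cost also depends on the codon substitution scores inside it.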

Taxonomy

Topics: Genomics and Phylogenetic Studies · RNA and protein synthesis mechanisms · Machine Learning in Bioinformatics