TL;DR
This paper presents a novel algorithm for aligning coding DNA sequences that accounts for both the initiation and length of frameshift translations, improving robustness and accuracy in detecting such events.
Contribution
The authors introduce a new scoring scheme and algorithm for pairwise CDS alignment that considers frameshift translation initiation and length, addressing limitations of previous methods.
Findings
The method is robust to parameter changes.
It performs well on data both with and without frameshift translations.
It outperforms existing CDS alignment methods in tests.
Abstract
Frameshift translation is an important phenomenon that contributes to the appearance of novel Coding DNA Sequences (CDS) and functions in gene evolution, by allowing alternative amino acid translations of gene coding regions. Frameshift translations can be identified by aligning two CDS, from the same gene or from homologous genes, while accounting for their codon structure. Two main classes of algorithms have been proposed to solve the problem of aligning CDS: back-translating an amino acid sequence alignment, or simultaneously accounting for the nucleotide and amino acid levels. The former cannot account for frameshift translations, and up to now the latter has accounted only for frameshift translation initiation, ignoring the length of the translation disruption caused by a frameshift. Here, we introduce a new scoring scheme with an algorithm for the…
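To make the distinction between frameshift initiation and frameshift length concrete, here is a minimal Python sketch. It is not the authors' algorithm or scoring scheme: it takes an already computed pairwise nucleotide alignment (with `-` for gaps) and reports each run of aligned columns where the two sequences sit in different reading frames. The function names and the `open_cost` / `extend_cost` parameters are hypothetical and chosen only for illustration.

```python
def frameshift_segments(aln_a, aln_b):
    """Return (start_column, length) for each run of aligned nucleotide
    pairs whose codon phases differ between the two sequences.

    Gap columns do not count toward the length and do not end a run.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")

    segments = []
    pos_a = pos_b = 0          # ungapped positions consumed so far
    run_start = None           # column where the current shifted run began
    run_len = 0                # aligned pairs counted in the current run

    for col, (a, b) in enumerate(zip(aln_a, aln_b)):
        if a != "-" and b != "-":
            # Phase = position within the codon (0, 1, or 2).
            shifted = (pos_a % 3) != (pos_b % 3)
            if shifted:
                if run_start is None:
                    run_start, run_len = col, 0
                run_len += 1
            elif run_start is not None:
                segments.append((run_start, run_len))
                run_start = None
        if a != "-":
            pos_a += 1
        if b != "-":
            pos_b += 1

    if run_start is not None:
        segments.append((run_start, run_len))
    return segments


def frameshift_penalty(segments, open_cost=10.0, extend_cost=1.0):
    """Hypothetical penalty: a fixed cost per frameshift initiation plus a
    cost proportional to the number of frameshifted columns."""
    return sum(open_cost + extend_cost * length for _, length in segments)


if __name__ == "__main__":
    # One nucleotide deleted in the second sequence, then one inserted
    # four aligned columns later, which restores the reading frame.
    a = "ATGGCTAAA-CCCGGG"
    b = "ATGG-TAAATCCCGGG"
    segs = frameshift_segments(a, b)
    print(segs, frameshift_penalty(segs))
```

A penalty that depends only on the number of segments would score a short and a long frameshifted region identically; adding a length-dependent term, as in the hypothetical `frameshift_penalty` above, is one simple way to distinguish them.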
