Improving spliced alignment for identification of ortholog groups and multiple CDS alignment
Jean-David Aguilar, Safa Jammali, Esaie Kuitche, Aïda Ouangraoua

TL;DR
This paper introduces a new constrained spliced alignment method and an extension to multiple spliced alignments, improving the identification of ortholog groups and multiple CDS alignments within gene families.
Contribution
It presents a novel constrained spliced alignment algorithm and the Multiple Spliced Alignment Problem (MSAP), enabling more accurate ortholog group detection and multiple CDS alignments.
Findings
High-coverage accurate alignments achieved
Effective clustering of CDS into ortholog and paralog groups
Implementation available in Python upon request
Abstract
The Spliced Alignment Problem (SAP), which consists of finding an optimal semi-global alignment of a spliced RNA sequence on an unspliced genomic sequence, has been widely studied for the prediction and annotation of gene structures in genomes. Here, we revisit it for the purpose of identifying CDS ortholog groups within a set of CDS from homologous genes and for computing multiple CDS alignments. We introduce a new constrained version of the spliced alignment problem together with an algorithm that exploits full information on the exon-intron structure of the input RNA and gene sequences in order to compute high-coverage accurate alignments. We show how pairwise spliced alignments between the CDS and the gene sequences of a gene family can be used directly to cluster the CDS of the gene family into a set of ortholog groups. We also introduce an extension of the…
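To make the term "semi-global alignment" concrete, here is a minimal Python sketch of the generic technique: the whole CDS must be aligned, while unaligned gene positions before and after it cost nothing. This is not the constrained spliced alignment algorithm of the paper. The function name `semiglobal_align_score` and the scoring values are assumptions chosen for illustration, and the sketch ignores introns entirely, whereas a real spliced aligner must let the CDS skip over them.

```python
def semiglobal_align_score(cds, gene, match=1, mismatch=-1, gap=-2):
    """Score the best alignment of all of `cds` against any substring of `gene`.

    Hypothetical helper for illustration only; scoring values are assumptions.
    Returns (best_score, end), where `end` is the number of gene characters
    consumed up to the end of the best alignment.
    """
    n, m = len(cds), len(gene)
    # Row 0 is all zeros: skipping a gene prefix is free.
    prev = [0] * (m + 1)
    for i in range(1, n + 1):
        # Column 0: cds[:i] aligned against nothing, so i gaps are paid.
        curr = [prev[0] + gap] + [0] * m
        for j in range(1, m + 1):
            sub = match if cds[i - 1] == gene[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + sub,   # align cds[i-1] with gene[j-1]
                prev[j] + gap,       # cds[i-1] aligned to a gap
                curr[j - 1] + gap,   # gene[j-1] aligned to a gap
            )
        prev = curr
    # Taking the maximum over the last row makes the gene suffix free.
    end = max(range(m + 1), key=lambda j: prev[j])
    return prev[end], end


if __name__ == "__main__":
    print(semiglobal_align_score("ACGT", "TTACGTTT"))
```

With a linear gap penalty like this one, a long intron would be charged as a very long gap, which is why spliced alignment methods treat exon-intron structure explicitly rather than relying on plain semi-global alignment.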
Taxonomy
Topics: Genomics and Phylogenetic Studies · RNA and protein synthesis mechanisms · RNA modifications and cancer
