CNCA aligns small annotated genomes
Jean-Noël Lorenzi, François Graner, Virginie Courtier-Orgogozo, Guillaume Achaz

TL;DR
CNCA is a new tool that aligns small genomes, including both coding and non-coding regions, to better study evolutionary relationships like those of SARS-CoV-2.
Contribution
To the authors' knowledge, CNCA is the first tool to align small genomes using both coding and non-coding sequences while keeping the nucleotide alignment consistent with the protein alignment and avoiding frameshifts.
Findings
CNCA aligns closely related small genomes of up to 50 kb (typically viruses) whose gene order is conserved.
The tool aligns coding and non-coding regions together, retaining the non-coding regions that coding-only alignment methods leave out.
CNCA ensures the final nucleotide alignment matches the protein alignment without frameshifts.
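To illustrate why a nucleotide alignment that follows the protein alignment cannot contain frameshifts, here is a minimal sketch of codon threading: each aligned residue is replaced by its own codon and each protein gap by three nucleotide gaps. This is not CNCA's implementation. The function name `thread_codons`, the sequence names, and the toy sequences are hypothetical, and the sketch assumes each coding sequence is given without its stop codon.

```python
def thread_codons(aligned_protein: str, cds: str) -> str:
    """Map one aligned protein row (with '-' gaps) back onto its coding sequence."""
    residues = aligned_protein.replace("-", "")
    # Assumption: the CDS has exactly one codon per residue (no stop codon).
    if len(cds) != 3 * len(residues):
        raise ValueError("CDS length must be three times the number of residues")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    # A protein gap becomes three nucleotide gaps; a residue consumes the next codon.
    return "".join("---" if aa == "-" else next(codons) for aa in aligned_protein)


# Toy input: two hypothetical sequences whose protein rows are already aligned.
protein_rows = {"seqA": "MK-L", "seqB": "MKTL"}
coding_seqs = {"seqA": "ATGAAACTG", "seqB": "ATGAAGACCCTG"}

codon_rows = {
    name: thread_codons(protein_rows[name], coding_seqs[name])
    for name in protein_rows
}

for name, row in codon_rows.items():
    print(name, row)
```

Because gaps are only ever inserted in blocks of three at codon boundaries, every row keeps its reading frame and all rows end up with the same length, three times the width of the protein alignment.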
Abstract
To explore the evolutionary history of sequences, a sequence alignment is a first and necessary step, and its quality is crucial. In the context of the study of the proximal origins of SARS-CoV-2 coronavirus, we wanted to construct an alignment of genomes closely related to SARS-CoV-2 using both coding and non-coding sequences. To our knowledge, there is no tool that can be used to construct this type of alignment, which motivated the creation of CNCA. CNCA is a web tool that aligns annotated genomes from GenBank files. It generates a nucleotide alignment that is then updated based on the protein sequence alignment. The output final nucleotide alignment matches the protein alignment and guarantees no frameshift. CNCA was designed to align closely related small genome sequences up to 50 kb (typically viruses) for which the gene order is conserved. CNCA constructs multiple alignments of…
Figure 1
Taxonomy
Topics: Genomics and Phylogenetic Studies · Bacteriophages and microbial interactions · RNA and protein synthesis mechanisms
