Extending Biopython to combine multiple sequence alignments with the same reference into a Multiple Sequence Alignment
Cassia Bastress, Michiel de Hoon, Manuel Lera-Ramirez, Jürg Bähler

TL;DR
This paper introduces a new method in Biopython to merge multiple pairwise alignments into a single multiple sequence alignment when they share a common reference.
Contribution
A novel algorithm implemented in Biopython to combine pairwise alignments into an MSA while preserving their structure.
Findings
The algorithm successfully merges pairwise alignments with a common reference into a single MSA.
The implementation is available as a classmethod in Biopython’s Alignment class.
The method is useful for workflows involving circular plasmid sequence validation.
Abstract
Pairwise alignments (PWAs) are commonly used to compare sequences to a reference. Existing alignment tools provide algorithms to align multiple sequences to a single reference and to merge two sets of aligned sequences, but not to combine individually aligned PWAs that share a common reference into a single multiple sequence alignment (MSA) that preserves their original alignment structure. Certain workflows require this. One example is validating a circular plasmid sequence by aligning multiple sequencing traces to it: some alignment tools that take the circularity of the plasmid into account return one PWA per sequencing trace, and for visualization all PWAs have to be combined into a single MSA. For this purpose, we developed an algorithm that combines alignments sharing the same reference into an MSA, and implemented it as a classmethod in Biopython’s Alignment class.
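To make the idea of combining PWAs that share a reference concrete, here is a minimal pure-Python sketch. It is not the authors' algorithm and not Biopython's implementation. The function name `merge_pairwise`, the input format (pairs of gapped strings using `-` as the gap character), and the choice to left-justify insertions within their padded columns are all assumptions made for illustration.

```python
def merge_pairwise(pairs):
    """Merge pairwise alignments sharing one reference into MSA rows.

    `pairs` is a list of (gapped_reference, gapped_query) strings.
    Hypothetical helper for illustration only; returns a list of gapped
    strings with the reference first, then one row per query.
    """
    reference = pairs[0][0].replace("-", "")
    n = len(reference)
    parsed = []
    for gapped_ref, gapped_query in pairs:
        if len(gapped_ref) != len(gapped_query):
            raise ValueError("rows of a pairwise alignment differ in length")
        if gapped_ref.replace("-", "") != reference:
            raise ValueError("alignments do not share the same reference")
        inserts = [""] * (n + 1)  # query letters inserted before reference position i
        aligned = [""] * n        # query letter (or gap) aligned to reference position i
        pos = 0
        for r, q in zip(gapped_ref, gapped_query):
            if r == "-":
                inserts[pos] += q
            else:
                aligned[pos] = q
                pos += 1
        parsed.append((inserts, aligned))

    # Each insertion site needs as many columns as the longest insertion there.
    widths = [max(len(ins[i]) for ins, _ in parsed) for i in range(n + 1)]

    rows = [[] for _ in range(len(parsed) + 1)]
    for i in range(n + 1):
        rows[0].append("-" * widths[i])  # the reference has gaps at insertion sites
        for row, (ins, _) in zip(rows[1:], parsed):
            # Assumption: insertions are left-justified and padded with gaps.
            row.append(ins[i].ljust(widths[i], "-"))
        if i < n:
            rows[0].append(reference[i])
            for row, (_, aligned) in zip(rows[1:], parsed):
                row.append(aligned[i])
    return ["".join(row) for row in rows]


if __name__ == "__main__":
    pwas = [("ACG-T", "ACGAT"), ("A-CGT", "ATC-T")]
    for line in merge_pairwise(pwas):
        print(line)
```

Each query keeps the same letter-to-reference pairing it had in its own PWA. Only gap columns are added, so that insertions found in one trace open matching gaps in the reference and in every other trace.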
Taxonomy
Topics: Genomics and Phylogenetic Studies · Cell Image Analysis Techniques · Biomedical Text Mining and Ontologies
