Multiple Sequence Alignment System for Pyrosequencing Reads
Fahad Saeed, Ashfaq Khokhar, Osvaldo Zagordi, Niko Beerenwinkel

TL;DR
This paper introduces pyro-align, a fast and accurate multiple sequence alignment method tailored for error-prone pyrosequencing reads, addressing efficiency and accuracy issues in large-scale genome analysis.
Contribution
The paper presents a novel domain decomposition-based alignment algorithm specifically designed for pyrosequencing reads, improving speed and accuracy over existing methods.
Findings
Significantly faster alignment compared to existing methods.
Accurate alignment confirmed through consensus analysis.
Effective handling of error-prone reads in large datasets.
Abstract
Pyrosequencing is among the emerging sequencing techniques, capable of generating up to 100,000 overlapping reads in a single run. This technique is much faster and cheaper than existing state-of-the-art sequencing techniques such as Sanger. However, the reads generated by pyrosequencing are short and contain numerous errors. Furthermore, each read has a specific position in the reference genome. In order to use these reads for any subsequent analysis, the reads must be aligned. Existing multiple sequence alignment methods cannot be used, as they do not take into account the specific positions of the sequences with respect to the genome and are highly inefficient for large numbers of sequences. Therefore, the common practice has been to use either simple pairwise alignment, despite its poor accuracy for error-prone pyroreads, or computationally expensive techniques based on…
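To make the role of read positions concrete, here is a minimal Python sketch. It is not the pyro-align algorithm. It assumes each read's start offset on the reference is already known and that errors are substitutions only (no insertions or deletions), which real pyroreads do not guarantee. The function names `place_reads` and `consensus` and the toy reads are hypothetical, chosen for illustration.

```python
from collections import Counter


def place_reads(reads, ref_len, gap="-"):
    """Pad each (start, sequence) read into a row of length ref_len.

    Assumption: `start` is the known 0-based offset of the read on the reference.
    """
    rows = []
    for start, seq in reads:
        if start < 0 or start + len(seq) > ref_len:
            raise ValueError("read falls outside the reference")
        rows.append(gap * start + seq + gap * (ref_len - start - len(seq)))
    return rows


def consensus(rows, gap="-"):
    """Return the majority base per column, or a gap where no read covers it."""
    out = []
    for column in zip(*rows):
        bases = [b for b in column if b != gap]
        out.append(Counter(bases).most_common(1)[0][0] if bases else gap)
    return "".join(out)


# Toy data: four overlapping reads; the last one carries a substitution error.
reads = [(0, "ACGT"), (2, "GTTA"), (3, "TTAC"), (2, "GATA")]
rows = place_reads(reads, ref_len=7)
print(consensus(rows))  # prints "ACGTTAC"
```

In this toy case the overlapping reads outvote the single erroneous base. A position-agnostic aligner would first have to rediscover the offsets that are supplied here as input.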
Taxonomy
Topics: Genomics and Phylogenetic Studies · Microbial Community Ecology and Physiology · Protist diversity and phylogeny
