Fast genomic optical map assembly algorithm using binary representation
Przemysław Stawczyk, Robert Nowak

TL;DR
This paper introduces a fast algorithm, built on a binary representation of genome maps, for assembling optical maps without a reference genome. It is faster than dynamic programming approaches and is suited to error-free data in genome assembly pipelines.
Contribution
The paper presents a novel binary representation algorithm for optical map assembly that enhances speed and scalability compared to dynamic programming approaches.
Findings
The algorithm is faster than dynamic programming methods.
Performs well on low-error optical mapping data.
Available as open-source software for genome assembly applications.
Abstract
The falling cost of genome sequencing brought by next-generation sequencing technologies has greatly increased the number of genomic projects. As a result, there is a growing need for better methods of assembly and assembly validation. One promising idea is to use heterogeneous data in assembly projects. Optical Mapping (OM) is useful for validating, correcting, and scaffolding genomic assemblies. A single raw OM read describes a long fragment of a DNA molecule, up to 1 Mbp. Raw OM data from the same genome can be assembled into consensus maps that span an entire chromosome. The assembly process is computationally hard because of the large number of errors in the input data. This work describes a new algorithm and computer program for assembling OM reads without a reference genome. In our algorithm, we explored a binary representation for genome maps. We focused on the efficiency of data…
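The abstract does not spell out the binary representation, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes error-free reads, a fixed bin size of 1 kbp, and a simple shared-site count as the overlap score; the names `encode_map`, `shared_sites`, and `best_overlap` are hypothetical.

```python
def encode_map(site_positions_bp, resolution_bp=1000):
    """Encode site positions as an int bit vector: bit i is set when
    a site falls in bin i (assumed bin size: resolution_bp)."""
    bits = 0
    for pos in site_positions_bp:
        bits |= 1 << (pos // resolution_bp)
    return bits


def shared_sites(a, b, shift):
    """Count bins where both maps have a site when map b starts
    `shift` bins after the start of map a."""
    return bin(a & (b << shift)).count("1")


def best_overlap(a, b, max_shift):
    """Try shifts 0..max_shift and return (shift, shared_count)
    for the shift with the most shared sites."""
    candidates = ((s, shared_sites(a, b, s)) for s in range(max_shift + 1))
    return max(candidates, key=lambda t: t[1])


# Two hypothetical error-free reads from the same molecule;
# read_b starts 5 kbp after read_a.
read_a = encode_map([2000, 5000, 9000, 14000, 20000])
read_b = encode_map([0, 4000, 9000, 15000])
print(best_overlap(read_a, read_b, max_shift=10))  # (5, 4)
```

Each candidate offset costs one shift, one bitwise AND, and one population count, which is the kind of saving over a dynamic programming alignment that a bit-level encoding can offer. Real OM reads contain sizing errors and missing or spurious sites, which this sketch does not handle.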
Taxonomy
Topics: Genomics and Phylogenetic Studies · Chromosomal and Genetic Variations · Algorithms and Data Compression
