Barnacle: An Assembly Algorithm for Clone-based Sequences of Whole Genomes
Vicky Choi, Martin Farach-Colton

TL;DR
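Barnacle is an assembly algorithm for clone-based genome sequencing that resolves conflicts caused by repeated sequences more effectively and detects inconsistencies in the underlying data. It is demonstrated on the December 2001 freeze of the public working draft of the human genome.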
Contribution
It introduces a new assembly method that abandons the physical-mapping-first framework, resolving conflicts due to repeated sequences more effectively and detecting inconsistencies in the underlying data.
Findings
More effective conflict resolution in repetitive regions
Detection of inconsistencies in underlying data
Comparison with NCBI's assembly (Build 28)
Abstract
We propose an assembly algorithm, Barnacle, for sequences generated by the clone-based approach. We illustrate our approach by assembling the human genome. Our novel method abandons the original physical-mapping-first framework. As we show, Barnacle more effectively resolves conflicts due to repeated sequences, which are the main difficulty of the sequence assembly problem. In addition, we are able to detect inconsistencies in the underlying data. We present our results on the December 2001 freeze of the public working draft of the human genome and compare them with NCBI's assembly (Build 28). The assembly of the December 2001 freeze of the public working draft generated by Barnacle and the source code of Barnacle are available at http://www.cs.rutgers.edu/~vchoi.
Taxonomy
Topics: Genomics and Phylogenetic Studies · Algorithms and Data Compression · RNA and protein synthesis mechanisms
