Cerulean: A hybrid assembly using high throughput short and long reads

Viraj Deshpande, Eric DK Fung, Son Pham, and Vineet Bafna

arXiv:1307.7933 · q-bio.QM · July 31, 2013 · WABI · 1 cites

Open Access

TL;DR

Cerulean is a hybrid genome assembly method that combines short and long reads without extensive error correction of the long reads. It is computationally efficient enough to run on a standard desktop while still producing high-quality assemblies.

Contribution

The paper presents a hybrid assembly algorithm that avoids long-read error correction, improving efficiency and quality over existing methods.

Findings

01

Achieves comparable assembly quality with fewer computational resources.

02

Operates efficiently on standard desktop hardware.

03

Produces high-quality assemblies for bacterial genomes.

Abstract

Genome assembly from high-throughput short-read data arguably remains an unresolvable task in repetitive genomes: when the length of a repeat exceeds the read length, it becomes difficult to unambiguously connect the flanking regions. The emergence of third-generation sequencing (Pacific Biosciences) with long reads makes it possible to resolve complicated repeats that short-read data could not. However, these long reads have a high error rate, and assembling the genome without additional high-quality short reads is an uphill task. Recently, Koren et al. (2012) proposed using high-quality short-read data to correct these long reads and thus make assembly from long reads possible. However, due to the large size of both datasets (short and long reads), error correction of these long reads requires excessively high…
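
The abstract's opening claim, that repeats longer than the read length leave the flanking regions ambiguous, can be illustrated with a toy example. The sketch below is not Cerulean's algorithm; it assumes error-free reads modelled as exact k-mers, and the sequences and the names `kmer_spectrum` and `distinguishable` are made up for illustration. Two genomes that differ only in the order of the unique segments between three copies of a repeat produce identical k-mer content until the read length exceeds the repeat length by at least two bases, that is, until a read can span the repeat plus one base on each side.

```python
from collections import Counter

# Hypothetical toy sequences (assumptions, not data from the paper).
REPEAT = "ACGTACGTAC"  # repeat of length 10, present three times
SEG_A, SEG_B, SEG_C, SEG_D = "TTGCA", "GGATC", "CCTAG", "AATGC"

# Same segments, but the two middle unique segments are swapped.
genome_1 = SEG_A + REPEAT + SEG_B + REPEAT + SEG_C + REPEAT + SEG_D
genome_2 = SEG_A + REPEAT + SEG_C + REPEAT + SEG_B + REPEAT + SEG_D


def kmer_spectrum(genome: str, k: int) -> Counter:
    """Count every length-k substring, standing in for error-free reads of length k."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def distinguishable(k: int) -> bool:
    """True if reads of length k can tell the two genomes apart."""
    return kmer_spectrum(genome_1, k) != kmer_spectrum(genome_2, k)


if __name__ == "__main__":
    for k in (len(REPEAT), len(REPEAT) + 1, len(REPEAT) + 2):
        print(k, distinguishable(k))
```

With these toy sequences, reads of length 10 or 11 cannot separate the two genomes, while reads of length 12 can, because only then does a single read cover a full repeat copy together with a base from each flank.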

Taxonomy

Topics: Genomics and Phylogenetic Studies · Microbial Community Ecology and Physiology · Legume Nitrogen Fixing Symbiosis