Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences
Heng Li

TL;DR
This paper introduces minimap and miniasm, tools that map and de novo assemble noisy long reads from SMRT and ONT sequencing without an error correction stage, running orders of magnitude faster than existing assembly pipelines.
Contribution
The paper presents a mapper, minimap, and a de novo assembler, miniasm, that work directly on long, noisy reads with no error correction stage, together with two interchange formats, PAF and GFA.
Findings
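Can often assemble a sequencing run of bacterial data into a single contig in a few minutes.
Assembles 45-fold C. elegans data in 9 minutes, orders of magnitude faster than existing pipelines.
Introduces two interoperable formats: PAF (pairwise read mapping format) and GFA (graphical fragment assembly format).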
Abstract
Motivation: Single Molecule Real-Time (SMRT) sequencing technology and Oxford Nanopore technologies (ONT) produce reads over 10kbp in length, which have enabled high-quality genome assembly at an affordable cost. However, at present, long reads have an error rate as high as 10-15%. Complex and computationally intensive pipelines are required to assemble such reads. Results: We present a new mapper, minimap, and a de novo assembler, miniasm, for efficiently mapping and assembling SMRT and ONT reads without an error correction stage. They can often assemble a sequencing run of bacterial data into a single contig in a few minutes, and assemble 45-fold C. elegans data in 9 minutes, orders of magnitude faster than the existing pipelines. We also introduce a pairwise read mapping format (PAF) and a graphical fragment assembly format (GFA), and demonstrate the interoperability between ours…
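To make the PAF format mentioned in the abstract more concrete, here is a minimal Python sketch of reading one PAF record. It assumes the 12 mandatory tab-separated columns described in the publicly documented PAF specification (query name, length, start, end; strand; target name, length, start, end; number of matching bases; alignment block length; mapping quality). The class name, field names, and the `approx_identity` helper are illustrative choices, not part of minimap or miniasm, and the sample line is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PafRecord:
    """One PAF line; field names are illustrative, not from the tools."""
    query_name: str
    query_len: int
    query_start: int   # 0-based start on the query
    query_end: int     # end on the query (exclusive)
    strand: str        # "+" or "-"
    target_name: str
    target_len: int
    target_start: int
    target_end: int
    n_match: int       # number of matching bases in the mapping
    block_len: int     # total length of the alignment block
    mapq: int          # mapping quality (255 conventionally means missing)


def parse_paf_line(line: str) -> PafRecord:
    """Parse the 12 mandatory columns; optional trailing tags are ignored."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError(f"expected at least 12 columns, got {len(fields)}")
    q, ql, qs, qe, strand, t, tl, ts, te, nm, bl, mq = fields[:12]
    if strand not in ("+", "-"):
        raise ValueError(f"unexpected strand value: {strand!r}")
    return PafRecord(q, int(ql), int(qs), int(qe), strand,
                     t, int(tl), int(ts), int(te),
                     int(nm), int(bl), int(mq))


def approx_identity(rec: PafRecord) -> float:
    """Rough identity estimate: matching bases over block length."""
    return rec.n_match / rec.block_len if rec.block_len else 0.0


# Made-up example line, for illustration only.
sample = "read1\t1000\t100\t900\t+\tread2\t1200\t200\t1000\t600\t800\t255"
record = parse_paf_line(sample)
print(record.query_name, record.target_name, approx_identity(record))
```

A frozen dataclass keeps each record immutable, which is convenient when overlaps are later filtered or grouped by read name.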
