Accurate selfcorrection of errors in long reads using de Bruijn graphs

Leena Salmela; Riku Walve; Eric Rivals; Esko Ukkonen

arXiv:1604.02233 · q-bio.GN · November 18, 2021 · Bioinform.

TL;DR

This paper introduces LoRMA, a novel long-read error correction method that uses de Bruijn graphs and multiple alignments, achieving high accuracy and increased throughput without relying on short reads.

Contribution

LoRMA is the first long-read-only correction method combining de Bruijn graphs and multiple alignments for improved accuracy and efficiency.

Findings

01 Most accurate long-read-only correction at high coverage

02 At least 20% higher throughput at 75x coverage

03 Effective for de novo genome assembly applications

Abstract

New long read sequencing technologies, like PacBio SMRT and Oxford NanoPore, can produce sequencing reads up to 50,000 bp long but with an error rate of at least 15%. Reducing the error rate is necessary for subsequent utilisation of the reads in, e.g., de novo genome assembly. The error correction problem has been tackled either by aligning the long reads against each other or by a hybrid approach that uses the more accurate short reads produced by second generation sequencing technologies to correct the long reads. We present an error correction method that uses long reads only. The method consists of two phases: first we use an iterative alignment-free correction method based on de Bruijn graphs with increasing length of k-mers, and second, the corrected reads are further polished using long-distance dependencies that are found using multiple alignments. According to our experiments…
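
The first phase described in the abstract (repeated correction with de Bruijn graphs built from increasingly long k-mers) can be illustrated with a small Python sketch. This is not the LoRMA implementation: the function names, the k values, and the `min_count` threshold are hypothetical, and the correction step is a deliberately simplified substitution-only heuristic that only checks k-mer membership, swapped in here for whatever graph search the actual method performs. The graph edges are built purely to show how solid k-mers define the graph.

```python
from collections import Counter


def solid_kmers(reads, k, min_count):
    """Return the k-mers that occur at least min_count times across all reads."""
    counts = Counter(r[i:i + k] for r in reads for i in range(len(r) - k + 1))
    return {kmer for kmer, c in counts.items() if c >= min_count}


def de_bruijn_edges(kmers):
    """Each k-mer is an edge from its (k-1)-mer prefix to its (k-1)-mer suffix."""
    return {(kmer[:-1], kmer[1:]) for kmer in kmers}


def count_solid(seq, k, solid):
    """Number of k-mers in seq that belong to the solid set."""
    return sum(seq[i:i + k] in solid for i in range(len(seq) - k + 1))


def correct_read(read, k, solid):
    """Toy correction (assumption, not the paper's algorithm): at each position,
    keep the base that maximises the number of solid k-mers covering it."""
    bases = list(read)
    for pos in range(len(bases)):
        lo = max(0, pos - k + 1)
        hi = min(len(bases), pos + k)
        best_base = bases[pos]
        best_score = count_solid("".join(bases[lo:hi]), k, solid)
        for candidate in "ACGT":
            bases[pos] = candidate
            score = count_solid("".join(bases[lo:hi]), k, solid)
            if score > best_score:  # only accept strict improvements
                best_base, best_score = candidate, score
        bases[pos] = best_base
    return "".join(bases)


def iterative_correct(reads, ks=(3, 5), min_count=2):
    """Run one correction round per k, from short to long k-mers.
    Each round recomputes the solid k-mers from the reads corrected so far."""
    for k in ks:
        solid = solid_kmers(reads, k, min_count)
        reads = [correct_read(r, k, solid) for r in reads]
    return reads


# Three error-free copies and one copy with a single substitution (T -> A).
reads = ["ACGTTGCA"] * 3 + ["ACGTAGCA"]
corrected = iterative_correct(reads)
edges = de_bruijn_edges(solid_kmers(reads, 3, 2))
```

In this toy input the erroneous k-mers occur only once, so they fall below the threshold and the substitution is reverted in the first round. Starting with a small k keeps enough k-mers solid despite a high error rate, and later rounds with larger k can separate repeats that short k-mers cannot distinguish. Real long reads are dominated by insertions and deletions, which a substitution-only heuristic like this one cannot repair.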

Taxonomy

Topics: Genomics and Phylogenetic Studies · Molecular Biology Techniques and Applications · Chromosomal and Genetic Variations