Minimum error correction-based haplotype assembly: considerations for long read data

Sina Majidian; Mohammad Hossein Kahaei; Dick de Ridder

arXiv:1803.05019·q-bio.GN·June 19, 2020


TL;DR

This paper critically examines haplotype assembly based on the Minimum Error Correction (MEC) metric, showing its limitations with error-prone long reads from several sequencing devices and suggesting coverage requirements to improve accuracy.

Contribution

It demonstrates that MEC can produce incorrect haplotypes with long read data and provides coverage guidelines to mitigate this issue.

Findings

01. MEC may lead to incorrect haplotypes with long reads.

02. Coverage of 25 is recommended for Pacific BioSciences RS data.

03. MEC performance varies with error rates and coverage levels.

Abstract

The single nucleotide polymorphism (SNP) is the most widely studied type of genetic variation. A haplotype is defined as the sequence of alleles at SNP sites on each haploid chromosome. Haplotype information is essential in unravelling the genome-phenotype association. Haplotype assembly is a well-known approach for reconstructing haplotypes, exploiting reads generated by DNA sequencing devices. The Minimum Error Correction (MEC) metric is often used for reconstruction of haplotypes from reads. However, problems with the MEC metric have been reported. Here, we investigate the MEC approach to demonstrate that it may result in incorrectly reconstructed haplotypes for devices that produce error-prone long reads. Specifically, we evaluate this approach for devices developed by Illumina, Pacific BioSciences and Oxford Nanopore Technologies. We show that imprecise haplotypes may be…
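To make the MEC metric mentioned in the abstract concrete, the following is a minimal sketch of how an MEC score is commonly computed for a diploid sample. It assumes the usual simplified setting: alleles are binary (0 or 1), every site is heterozygous (so the second haplotype is the complement of the first), and a read is a list with `None` at sites it does not cover. The function names `hamming`, `mec_score` and `brute_force_mec` are hypothetical and chosen for illustration; this is not taken from the authors' repository.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

# A read holds one allele per SNP site: 0, 1, or None when the site is not covered.
Read = Sequence[Optional[int]]


def hamming(read: Read, haplotype: Sequence[int]) -> int:
    """Count mismatches between a read and a haplotype over covered sites only."""
    return sum(1 for r, h in zip(read, haplotype) if r is not None and r != h)


def mec_score(reads: Sequence[Read], h1: Sequence[int]) -> int:
    """MEC score: each read is assigned to the closer haplotype, and the
    remaining mismatches (the 'corrections') are summed over all reads.

    Assumption: all sites are heterozygous, so h2 is the complement of h1.
    """
    h2 = [1 - a for a in h1]
    return sum(min(hamming(r, h1), hamming(r, h2)) for r in reads)


def brute_force_mec(reads: Sequence[Read], n_sites: int) -> Tuple[List[int], int]:
    """Exhaustively search for a haplotype minimising the MEC score.

    Only feasible for a handful of sites; the first allele is fixed to 0
    because a haplotype and its complement give the same score.
    """
    best_h: List[int] = []
    best_score = -1
    for tail in product((0, 1), repeat=n_sites - 1):
        h = [0, *tail]
        score = mec_score(reads, h)
        if best_score < 0 or score < best_score:
            best_h, best_score = h, score
    return best_h, best_score


# Toy example: five reads over four SNP sites; the last read carries one error.
reads = [
    [0, 0, 1, None],
    [0, 0, 1, 1],
    [1, 1, 0, None],
    [None, 1, 0, 0],
    [0, 1, 1, 1],
]

print(mec_score(reads, [0, 0, 1, 1]))  # 1
print(brute_force_mec(reads, 4))       # ([0, 0, 1, 1], 1)
```

In this toy case the MEC-optimal haplotype is the one most reads agree on. The concern raised in the abstract is that with error-prone reads and insufficient coverage, the haplotype with the lowest MEC score need not be the true one.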

Code & Models

Repositories

smajidian/MEC (Official)