ntLink: a toolkit for de novo genome assembly scaffolding and mapping using long reads
Lauren Coombe, René L. Warren, Johnathan Wong, Vladimir Nikolic, Inanc Birol

TL;DR
ntLink is a resource-efficient, minimizer-based genome scaffolding tool that improves assembly quality using long-read data, with new features like overlap detection and gap filling, applicable to various downstream genomic analyses.
Contribution
ntLink introduces a novel minimizer-based approach for genome scaffolding that enhances efficiency and accuracy, incorporating new features such as overlap detection and gap filling.
Findings
Achieves highly contiguous genome assemblies
Maintains computational efficiency across protocols
Enables downstream applications like misassembly detection
Abstract
With the increasing affordability and accessibility of genome sequencing data, de novo genome assembly is an important first step to a wide variety of downstream studies and analyses. Therefore, bioinformatics tools that enable the generation of high-quality genome assemblies in a computationally efficient manner are essential. Recent developments in long-read sequencing technologies have greatly benefited genome assembly work, including scaffolding, by providing long-range evidence that can aid in resolving the challenging repetitive regions of complex genomes. ntLink is a flexible and resource-efficient genome scaffolding tool that utilizes long-read sequencing data to improve upon draft genome assemblies built from any sequencing technologies, including the same long reads. Instead of using read alignments to identify candidate joins, ntLink utilizes minimizer-based mappings to infer…
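The minimizer idea mentioned in the abstract can be illustrated with a small sketch. This is not ntLink's code: the function names, the parameters `k` and `w`, and the use of plain lexicographic order to rank k-mers are assumptions made for illustration (real tools typically rank k-mers by a hash value instead). It only shows how a sequence can be reduced to a sparse, ordered list of k-mers that two sequences can be compared on without a full alignment.

```python
def minimizers(seq, k, w):
    """Return the ordered list of (position, k-mer) minimizers of seq.

    Assumption: k-mers are ranked lexicographically; ties go to the
    leftmost k-mer in the window.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = []
    # Slide a window of w consecutive k-mers along the sequence.
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        offset = min(range(w), key=lambda j: window[j])
        hit = (start + offset, window[offset])
        # Adjacent windows often select the same k-mer; record it once.
        if not picked or picked[-1] != hit:
            picked.append(hit)
    return picked


def shared_minimizers(a, b, k, w):
    """Return the k-mers selected as minimizers in both sequences."""
    return {m for _, m in minimizers(a, k, w)} & {m for _, m in minimizers(b, k, w)}
```

For example, `shared_minimizers(long_read, contig, k, w)` returns the minimizer k-mers the two hypothetical inputs have in common, which is the kind of lightweight evidence a mapping step could build on.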
Taxonomy
Topics: Genomics and Phylogenetic Studies · RNA and protein synthesis mechanisms · Algorithms and Data Compression
