RawHash2: Mapping Raw Nanopore Signals Using Hash-Based Seeding and Adaptive Quantization
Can Firtina, Melina Soysal, Joël Lindegger, Onur Mutlu

TL;DR
RawHash2 advances real-time nanopore signal analysis by improving hash-based mapping accuracy and throughput through enhanced quantization, filtering, and support for new data formats and flow cells.
Contribution
RawHash2 introduces significant improvements over RawHash, including enhanced sensitivity, weighted mapping, frequency filtering, minimizers, and support for R10.4 flow cells and multiple data formats.
Findings
10.57% average improvement in F1 accuracy over RawHash
4.0x average throughput increase over RawHash
Supports the R10.4 flow cell and multiple data formats, including POD5 and SLOW5
Abstract
Summary: Raw nanopore signals can be analyzed while they are being generated, a process known as real-time analysis. Real-time analysis of raw signals is essential to utilize the unique features that nanopore sequencing provides, enabling the early stopping of the sequencing of a read or the entire sequencing run based on the analysis. The state-of-the-art mechanism, RawHash, offers the first hash-based efficient and accurate similarity identification between raw signals and a reference genome by quickly matching their hash values. In this work, we introduce RawHash2, which provides major improvements over RawHash, including a more sensitive quantization and chaining implementation, weighted mapping decisions, frequency filters to reduce ambiguous seed hits, minimizers for hash-based sketching, and support for the R10.4 flow cell version and various data formats such as POD5 and SLOW5.…
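The abstract names quantization, hash-based seeding, minimizers, and frequency filtering without showing how they fit together. The following is a minimal sketch of that general pipeline, not RawHash2's actual implementation: the function names, the uniform bucketing scheme, the base-`n_buckets` packing used as a hash, and all parameter values (`n_buckets`, `k`, `w`, `max_freq`) are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def quantize(events, n_buckets=8, lo=-3.0, hi=3.0):
    """Map each normalized event value to an integer bucket in [0, n_buckets).

    Uniform bucketing over [lo, hi] is an illustrative assumption.
    """
    width = (hi - lo) / n_buckets
    out = []
    for v in events:
        clipped = min(max(v, lo), hi)
        out.append(min(int((clipped - lo) / width), n_buckets - 1))
    return out


def seed_hashes(q, k=3, n_buckets=8):
    """Pack each run of k consecutive buckets into one integer.

    Returns (hash, position) pairs. Base-n_buckets packing stands in
    for a real hash function here.
    """
    hashes = []
    for i in range(len(q) - k + 1):
        h = 0
        for b in q[i:i + k]:
            h = h * n_buckets + b
        hashes.append((h, i))
    return hashes


def minimizers(hashes, w=3):
    """Keep the smallest (hash, position) pair in every window of w seeds."""
    picked = set()
    for i in range(len(hashes) - w + 1):
        picked.add(min(hashes[i:i + w]))
    return sorted(picked, key=lambda hp: hp[1])


def build_index(seeds, max_freq=2):
    """Index positions by hash, dropping hashes seen more than max_freq times.

    Dropping frequent hashes is the frequency filter: such seeds would
    produce many ambiguous hits.
    """
    counts = Counter(h for h, _ in seeds)
    index = defaultdict(list)
    for h, pos in seeds:
        if counts[h] <= max_freq:
            index[h].append(pos)
    return dict(index)
```

In this sketch a reference would be indexed once with `build_index(minimizers(seed_hashes(quantize(reference_events))))`, and each incoming chunk of read signal would be passed through the same `quantize`, `seed_hashes`, and `minimizers` steps before looking its hashes up in the index. Coarser buckets tolerate more signal noise at the cost of more ambiguous matches, which is the trade-off the frequency filter is meant to offset.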
Taxonomy
Topics: Nanopore and Nanochannel Transport Studies · Genomics and Phylogenetic Studies · Machine Learning in Bioinformatics
Methods: Early Stopping
