RawAlign: Accurate, Fast, and Scalable Raw Nanopore Signal Mapping via Combining Seeding and Alignment
Joël Lindegger, Can Firtina, Nika Mansouri Ghiasi, Mohammad Sadrosadati, Mohammed Alser, Onur Mutlu

TL;DR
RawAlign is a novel method that combines seeding and alignment to achieve accurate, fast, and scalable mapping of raw nanopore signals, especially for large genomes, outperforming existing tools in accuracy and efficiency.
Contribution
RawAlign introduces the first integration of fine-grained signal alignment into raw nanopore signal mapping, with algorithmic improvements and hardware acceleration for enhanced performance.
Findings
Achieves the most accurate mapping for large genomes.
Performs comparably to RawHash, with slight variations.
Outperforms UNCALLED and Sigmap by significant margins.
Abstract
Nanopore sequencers generate raw electrical signals representing the contents of a biological sequence molecule passing through the nanopore. These signals can be analyzed directly, avoiding basecalling entirely. We observe that while existing proposals for raw signal analysis typically do well in all metrics for small genomes (e.g., viral genomes), they all perform poorly for large genomes (e.g., the human genome). Our goal is to analyze raw nanopore signals in an accurate, fast, and scalable manner. To this end, we propose RawAlign, the first work to integrate fine-grained signal alignment into the state-of-the-art raw signal mapper. To enable accurate, fast, and scalable mapping with alignment, RawAlign implements three algorithmic improvements and hardware acceleration via a vectorized implementation of fine-grained alignment. Together, these significantly reduce the overhead of…
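To make "fine-grained signal alignment" concrete, here is a minimal sketch that assumes the alignment step is a dynamic time warping (DTW) comparison between a read's signal values and the expected signal of each candidate reference region produced by seeding. The abstract as shown here does not name the alignment algorithm, so DTW is an assumption; the function names `dtw_distance` and `pick_best_candidate` and the toy signal values are hypothetical and not taken from RawAlign's code. The sketch is plain, unvectorized Python, whereas the paper describes a vectorized implementation.

```python
import math


def dtw_distance(query, reference):
    """Classic O(n*m) dynamic time warping distance (absolute-difference cost).

    Illustrative only: assumes both inputs are sequences of numbers, e.g.
    normalized signal event values.
    """
    n, m = len(query), len(reference)
    # prev[j] holds the best cost of aligning the first i-1 query values
    # with the first j reference values; only two rows are kept in memory.
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(query[i - 1] - reference[j - 1])
            # Extend the cheapest of: stretch reference, stretch query, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def pick_best_candidate(query, candidates):
    """Return (index, distance) of the candidate region with the lowest DTW cost.

    `candidates` stands in for the reference regions that a seeding step
    would propose (hypothetical interface).
    """
    scored = [(dtw_distance(query, c), idx) for idx, c in enumerate(candidates)]
    best_distance, best_index = min(scored)
    return best_index, best_distance


if __name__ == "__main__":
    read_signal = [1.0, 2.0, 3.0]
    candidate_regions = [
        [5.0, 6.0, 7.0],       # poor match
        [1.0, 2.0, 2.0, 3.0],  # same shape, one value stretched in time
        [0.0, 0.0, 0.0],       # flat signal
    ]
    print(pick_best_candidate(read_signal, candidate_regions))
```

The point of the sketch is the division of labor: seeding narrows the search to a few candidate regions cheaply, and the more expensive alignment score is computed only for those candidates to decide which one to report.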
Taxonomy
Topics: Genomics and Phylogenetic Studies · Nanopore and Nanochannel Transport Studies · Algorithms and Data Compression
