Correcting Multiple Substitutions in Nanopore-Sequencing Reads
Anisha Banerjee, Yonatan Yehezkeally, Antonia Wachter-Zeh, Eitan Yaakobi

TL;DR
This paper analyzes the error-correcting requirements for nanopore sequencing reads, establishing bounds on redundancy needed for correcting multiple substitutions, and proposes optimal coding strategies based on a simplified noise model.
Contribution
It introduces a simplified model for nanopore sequencing errors and derives bounds on the redundancy needed for correcting multiple substitutions, providing insights into optimal coding schemes.
Findings
At least t log n - O(1) bits of redundancy are needed for correcting t ≥ 2 substitutions.
Correcting a single substitution requires at most log log n - O(1) bits of redundancy.
An error-correcting code close to the theoretical bounds can be constructed based on read vector properties.
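To give a feel for the gap between the two bounds listed above, the following minimal sketch evaluates only their leading terms. It assumes logarithms are base 2 (redundancy is measured in bits) and drops the unspecified O(1) terms; the function names are hypothetical and chosen here for illustration.

```python
import math


def multi_substitution_lower_bound(n: int, t: int) -> float:
    """Leading term t * log2(n) of the lower bound for t >= 2 substitutions.

    Assumption: base-2 logarithm; the O(1) term from the bound is omitted.
    """
    if t < 2:
        raise ValueError("this bound is stated for t >= 2 substitutions")
    return t * math.log2(n)


def single_substitution_upper_bound(n: int) -> float:
    """Leading term log2(log2(n)) of the bound for a single substitution.

    Assumption: base-2 logarithm; the O(1) term from the bound is omitted.
    """
    return math.log2(math.log2(n))


if __name__ == "__main__":
    n = 1024
    # For n = 1024: two substitutions need about 20 bits (leading term),
    # while a single substitution needs only about 3.32 bits.
    print(multi_substitution_lower_bound(n, 2))
    print(single_substitution_upper_bound(n))
```

Even at modest lengths the leading terms differ by several times, which is the jump from one to two substitutions that the findings describe.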
Abstract
Despite their significant advantages over competing technologies, nanopore sequencers are plagued by high error rates, due to physical characteristics of the nanopore and inherent noise in the biological processes. It is thus paramount not only to formulate efficient error-correcting constructions for these channels, but also to establish bounds on the minimum redundancy required by such coding schemes. In this context, we adopt a simplified model of nanopore sequencing inspired by the work of Mao et al., accounting for the effects of intersymbol interference and measurement noise. For an input sequence of length \(n\), the vector that is produced, designated as the read vector, may additionally suffer at most \(t\) substitution errors. We employ the well-known graph-theoretic clique-cover technique to establish that at least \(t\log n - O(1)\) bits of redundancy are required…
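The abstract does not spell out the channel, so the sketch below is only an illustration under stated assumptions: the input is binary, intersymbol interference is modeled as a sliding window of `window` consecutive symbols, and each read-vector entry is the Hamming weight of the current window, with zero padding at both ends. The names `read_vector` and `substitute` are hypothetical.

```python
from typing import Dict, List, Sequence


def read_vector(x: Sequence[int], window: int) -> List[int]:
    """Sliding-window Hamming weights of a binary sequence (assumed model).

    The sequence is zero-padded with window - 1 symbols on each side,
    so the output has len(x) + window - 1 entries.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    padded = [0] * (window - 1) + list(x) + [0] * (window - 1)
    return [sum(padded[i:i + window]) for i in range(len(x) + window - 1)]


def substitute(reads: Sequence[int], errors: Dict[int, int]) -> List[int]:
    """Return a copy of the read vector with entries replaced at given positions.

    `errors` maps a position to the (erroneous) value observed there.
    """
    noisy = list(reads)
    for position, value in errors.items():
        noisy[position] = value
    return noisy


if __name__ == "__main__":
    clean = read_vector([1, 0, 1, 1], window=2)
    print(clean)                           # [1, 1, 1, 2, 1]
    print(substitute(clean, {0: 0, 3: 1}))  # two substitutions (t = 2)
```

Under this assumed model, adjacent entries come from overlapping windows and can differ by at most 1, so read vectors are highly structured; structure of this kind is what a code "based on read vector properties" would exploit.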
Taxonomy
Topics: Nanopore and Nanochannel Transport Studies · Genomics and Phylogenetic Studies · Microfluidic and Capillary Electrophoresis Applications
