On the Asymptotic Rate of Optimal Codes that Correct Tandem Duplications for Nanopore Sequencing
Wenjun Yu, Zuo Ye, Moshe Schwartz

TL;DR
This paper analyzes the asymptotic rates of optimal codes capable of correcting tandem-duplication errors in nanopore sequencing, providing exact rates or bounds depending on error regimes and parameters.
Contribution
It derives the asymptotic rate of optimal duplication-correcting codes for nanopore sequencing, covering both an unbounded and a constant number of duplication errors, with exact rates where they can be found and bounds otherwise.
Findings
For an unbounded number of duplication errors, the exact asymptotic rate of optimal codes is given for some parameter values.
For a constant number $t$ of duplication errors, the redundancy of optimal codes is $t\log_q n + O(1)$ when $\ell \mid k$.
For the remaining parameter values in the unbounded regime, the paper gives bounds on the asymptotic rate instead of an exact value.
Abstract
We study codes that can correct backtracking errors during nanopore sequencing. In this channel, a sequence of length $n$ over an alphabet of size $q$ is being read by a sliding window of length $\ell$, where from each window we obtain only its composition. Backtracking errors cause some windows to repeat, hence manifesting as tandem-duplication errors of length $k$ in the $\ell$-read vector of window compositions. While existing constructions for duplication-correcting codes can be straightforwardly adapted to this model, even resulting in optimal codes, their asymptotic rate is hard to find. In the regime of unbounded number of duplication errors, we either give the exact asymptotic rate of optimal codes, or bounds on it, depending on the values of $k$, $\ell$, and $q$. In the regime of a constant number of duplication errors, $t$, we find the redundancy of optimal codes to be $t\log_q…
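To make the channel described in the abstract concrete, here is a minimal Python sketch of the read vector and of a single tandem duplication acting on it. The function names `read_vector` and `tandem_duplicate`, the parameter names `ell` (window length) and `k` (duplication length), and the example sequence are illustrative assumptions, not taken from the paper. The sketch also assumes that only full windows are read; the paper's exact convention for windows at the ends of the sequence may differ.

```python
from typing import List, Sequence, Tuple

# A composition records which symbols a window contains, ignoring their order.
Composition = Tuple[str, ...]


def read_vector(x: str, ell: int) -> List[Composition]:
    """Return the compositions of all full length-ell windows of x.

    Assumption: partial windows at the two ends are omitted for simplicity.
    """
    return [tuple(sorted(x[i:i + ell])) for i in range(len(x) - ell + 1)]


def tandem_duplicate(v: Sequence, i: int, k: int) -> list:
    """Repeat the length-k block v[i:i+k] immediately after itself."""
    if i < 0 or k < 1 or i + k > len(v):
        raise ValueError("duplicated block must lie inside the vector")
    v = list(v)
    return v[:i + k] + v[i:i + k] + v[i + k:]


# Hypothetical example: a DNA-like sequence read with a window of length 3.
x = "ACGTAC"
reads = read_vector(x, ell=3)

# A backtracking event that re-reads two windows shows up as a
# tandem duplication of length k = 2 in the read vector.
noisy = tandem_duplicate(reads, i=1, k=2)

print(reads)
print(noisy)
```

In this sketch the error acts on the vector of compositions rather than on the sequence itself, which is why duplication-correcting codes can be adapted to the model: the decoder sees a read vector in which some block of `k` consecutive entries appears twice in a row.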
Taxonomy
Topics: Advanced biosensing and bioanalysis techniques · Quantum-Dot Cellular Automata · DNA and Biological Computing
