Coding for Trace Reconstruction over Multiple Channels with Vanishing Deletion Probabilities
Serge Kas Hanna

TL;DR
This paper introduces a new coding scheme for trace reconstruction in DNA storage that allows efficient reconstruction from a constant number of traces when the deletion probability vanishes as the sequence length grows.
Contribution
We propose a novel code design enabling efficient trace reconstruction from a constant number of traces in the vanishing deletion probability regime.
Findings
The code achieves successful reconstruction with high probability as sequence length grows.
Simulation results show improved performance over existing methods in terms of edit distance error.
Theoretical analysis confirms the code's effectiveness in the low deletion probability regime.
Abstract
Motivated by DNA-based storage applications, we study the problem of reconstructing a coded sequence from multiple traces. We consider a model in which the traces are outputs of independent deletion channels, each of which deletes every bit of the input codeword \(\mathbf{x} \in \{0,1\}^n\) independently with probability \(p\). We focus on the regime where the deletion probability \(p \to 0\) as \(n \to \infty\). Our main contribution is a novel code for trace reconstruction that allows a coded sequence to be reconstructed efficiently from a constant number of traces. We provide theoretical results on the performance of our code, together with simulation results that compare it to other reconstruction techniques in terms of edit distance error.
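To make the channel model concrete, here is a minimal Python sketch of the setting described in the abstract: a binary sequence is passed through several independent deletion channels, and each resulting trace is compared to the input by edit distance. This is not the paper's code construction or its reconstruction algorithm. The function names (`deletion_channel`, `generate_traces`, `edit_distance`), the number of traces, and the choice \(p = 1/\sqrt{n}\) are assumptions made only for illustration.

```python
import math
import random


def deletion_channel(x, p, rng):
    """Pass x through a deletion channel: each bit is deleted independently with probability p."""
    return [bit for bit in x if rng.random() >= p]


def generate_traces(x, p, num_traces, seed=0):
    """Return num_traces traces of x, each produced by an independent deletion channel."""
    rng = random.Random(seed)
    return [deletion_channel(x, p, rng) for _ in range(num_traces)]


def edit_distance(a, b):
    """Levenshtein distance between sequences a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ai in enumerate(a, start=1):
        cur = [i]
        for j, bj in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ai
                           cur[j - 1] + 1,             # insert bj
                           prev[j - 1] + (ai != bj)))  # substitute or match
        prev = cur
    return prev[-1]


# Illustrative parameters only (assumed, not taken from the paper).
n = 256
p = 1 / math.sqrt(n)          # one arbitrary way to let p shrink as n grows
source_rng = random.Random(1)
x = [source_rng.randint(0, 1) for _ in range(n)]
traces = generate_traces(x, p, num_traces=5)

for k, y in enumerate(traces):
    print(f"trace {k}: length {len(y)}, edit distance to x = {edit_distance(x, y)}")
```

Because a trace is obtained from the input by deletions alone, its edit distance to the input equals the number of deleted bits. A reconstruction algorithm would instead be scored by the edit distance between its estimate and the original sequence.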
Taxonomy
Topics: Advanced Data Storage Technologies · DNA and Biological Computing · Algorithms and Data Compression
