Coding for Sequence Reconstruction for Single Edits
Kui Cai, Han Mao Kiah, Tuan Thanh Nguyen, and Eitan Yaakobi

TL;DR
This paper studies codes for reconstructing a sequence from a fixed number of noisy reads, each corrupted by a single edit error, and shows how the required redundancy shrinks as the number of reads grows. The setting is motivated by modern storage devices.
Contribution
It introduces reconstruction codes for fixed numbers of noisy reads with single edit errors, and characterizes how redundancy decreases with more reads, achieving near-optimal efficiency.
Findings
Redundancy falls from about log n to a constant as the number of reads increases.
Reconstruction codes are within one bit of optimal redundancy.
Designed codes work for all fixed numbers of noisy reads.
Abstract
The sequence reconstruction problem, introduced by Levenshtein in 2001, considers a communication scenario where the sender transmits a codeword from some codebook and the receiver obtains multiple noisy reads of the codeword. The common setup assumes the codebook to be the entire space and the problem is to determine the minimum number of distinct reads that is required to reconstruct the transmitted codeword. Motivated by modern storage devices, we study a variant of the problem where the number of noisy reads N is fixed. Specifically, we design reconstruction codes that reconstruct a codeword from N distinct noisy reads. We focus on channels that introduce a single edit error (i.e., a single substitution, insertion, or deletion) and their variants, and design reconstruction codes for all values of N. In particular, for the case of a single edit, we show that as the number of…
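To make the setup in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's construction: the function names `single_edit_ball` and `candidates`, the binary alphabet, and the convention that the ball contains the word itself ("at most one edit") are all assumptions made for illustration. It enumerates every word reachable by one substitution, insertion, or deletion, and then lists the length-n words that are consistent with a given set of distinct reads.

```python
from itertools import product


def single_edit_ball(x, alphabet="01"):
    """Words reachable from x by at most one substitution, insertion, or deletion.

    Assumption for this sketch: x itself is included in the ball.
    """
    ball = {x}
    for i in range(len(x)):
        ball.add(x[:i] + x[i + 1:])            # delete the symbol at position i
        for a in alphabet:
            ball.add(x[:i] + a + x[i + 1:])    # substitute position i with a
    for i in range(len(x) + 1):
        for a in alphabet:
            ball.add(x[:i] + a + x[i:])        # insert a before position i
    return ball


def candidates(reads, n, alphabet="01"):
    """Length-n words whose single-edit ball contains every read (brute force)."""
    words = ("".join(w) for w in product(alphabet, repeat=n))
    return [w for w in words
            if all(r in single_edit_ball(w, alphabet) for r in reads)]


if __name__ == "__main__":
    # With the whole space as the codebook, these two reads are ambiguous.
    print(candidates(["0", "1"], 2))
    # These two reads pin down a single length-2 word.
    print(candidates(["001", "011"], 2))
```

When `candidates` returns more than one word, the given reads cannot distinguish between them if every length-n word is allowed. Restricting the codebook so that such ambiguity never occurs for the given number of reads is the role a reconstruction code plays in the abstract; the brute-force search here is only practical for very short words.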
Taxonomy
Topics: Advanced Data Storage Technologies · Algorithms and Data Compression · DNA and Biological Computing
