On the size of error ball in DNA storage channels
Aryan Abbasian, Mahtab Mirmohseni, Masoumeh Nasiri Kenari

TL;DR
This paper derives the size of the error ball for DNA storage channels that combine multiple error types, with up to three edits. By Levenshtein's result, this quantity determines how many noisy copies sequence reconstruction needs to recover the original string.
Contribution
It derives the size of the error ball for channels with combined error types and limited edits, advancing understanding of DNA storage error correction.
Findings
Derived the size of the error ball for channels with multiple error types.
Provided formulas for error balls with up to three edits.
Strengthened the theoretical basis for sequence reconstruction and error correction in DNA storage.
Abstract
Recent experiments have demonstrated the feasibility of storing digital information in macromolecules such as DNA and protein. However, the DNA storage channel is prone to errors such as deletions, insertions, and substitutions. During the synthesis and reading phases of DNA strings, many noisy copies of the original string are generated. The problem of recovering the original string from these noisy copies is known as sequence reconstruction. A key concept in this problem is the error ball, which is the set of all possible sequences that can result from a limited number of errors applied to the original sequence. Levenshtein showed that the minimum number of noisy copies required for a given channel to recover the original sequence is equal to one plus the maximum size of the intersection of two error balls. Therefore, deriving the size of the error ball for any channel and any…
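The two definitions in the abstract, the error ball and Levenshtein's "one plus the maximum intersection" criterion, can be checked by brute force on very short strings. The sketch below is not the paper's method, which derives closed-form sizes. It is a minimal enumeration under stated assumptions: the ball contains every sequence reachable with at most t edits (including zero), the alphabet defaults to {A, C, G, T}, and the names `one_edit`, `error_ball`, and `max_intersection` are hypothetical helpers introduced here for illustration.

```python
from itertools import combinations, product

ALPHABET = "ACGT"  # assumption: the four DNA nucleotides


def one_edit(seq, kinds, alphabet=ALPHABET):
    """Return every sequence obtained from seq by exactly one allowed edit."""
    out = set()
    n = len(seq)
    if "del" in kinds:
        for i in range(n):
            out.add(seq[:i] + seq[i + 1:])
    if "ins" in kinds:
        for i in range(n + 1):
            for a in alphabet:
                out.add(seq[:i] + a + seq[i:])
    if "sub" in kinds:
        for i in range(n):
            for a in alphabet:
                if a != seq[i]:
                    out.add(seq[:i] + a + seq[i + 1:])
    return out


def error_ball(seq, t, kinds=("del", "ins", "sub"), alphabet=ALPHABET):
    """Set of sequences reachable from seq with at most t edits (assumed definition)."""
    ball = {seq}
    frontier = {seq}
    for _ in range(t):
        # Breadth-first expansion: keep only sequences not reached earlier.
        frontier = {y for x in frontier for y in one_edit(x, kinds, alphabet)} - ball
        ball |= frontier
    return ball


def max_intersection(n, t, kinds, alphabet=ALPHABET):
    """Largest intersection of two error balls over all distinct length-n strings."""
    words = ["".join(p) for p in product(alphabet, repeat=n)]
    balls = {w: error_ball(w, t, kinds, alphabet) for w in words}
    return max(len(balls[x] & balls[y]) for x, y in combinations(words, 2))


if __name__ == "__main__":
    print(len(error_ball("ACGT", 1, ("sub",))))
    # Copies needed under Levenshtein's criterion for a toy binary channel.
    print(max_intersection(3, 1, ("sub",), "01") + 1)
```

For example, `error_ball("ACGT", 1, ("sub",))` contains the original string plus 4 × 3 single-substitution variants, 13 sequences in total. The enumeration grows exponentially with the sequence length and the number of edits, which is why closed-form expressions such as those the paper derives are needed for realistic lengths.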
Taxonomy
Topics: DNA and Biological Computing · Algorithms and Data Compression · Cellular Automata and Applications
