Coding for Composite DNA to Correct Substitutions, Strand Losses, and Deletions
Frederik Walter, Omer Sabary, Antonia Wachter-Zeh, Eitan Yaakobi

TL;DR
This paper develops coding strategies for composite DNA data storage that correct substitutions, strand losses, and deletions. It derives non-asymptotic upper bounds on code size and gives explicit constructions that can achieve them.
Contribution
It introduces novel coding techniques tailored for composite DNA storage, including bounds and explicit constructions for correcting multiple error types.
Findings
Derived non-asymptotic upper bounds on code sizes for multiple error types
Presented explicit code constructions that can achieve these bounds
Enhanced reliability of DNA data storage systems
Abstract
Composite DNA is a recent method to increase the base alphabet size in DNA-based data storage. This paper models synthesizing and sequencing of composite DNA and introduces coding techniques to correct substitutions, losses of entire strands, and symbol deletion errors. Non-asymptotic upper bounds on the size of codes correcting these error types are derived. Explicit constructions are presented that can achieve the bounds.
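To make the idea of a larger alphabet concrete, the following is a minimal, error-free sketch of composite synthesis and sequencing. It is an illustration under stated assumptions, not the channel model or any construction from the paper: it assumes a composite letter is a fixed mixing ratio over the four bases, that many ordinary strands are synthesized with each position drawn from that ratio, and that the reader recovers a letter by matching observed base frequencies to the nearest ratio. The six-letter alphabet, the labels M and K (borrowed from IUPAC ambiguity codes), and the function names synthesize and decode are all hypothetical.

```python
import random
from collections import Counter

BASES = "ACGT"

# Hypothetical composite alphabet: each letter is a mixing ratio over (A, C, G, T).
COMPOSITE_ALPHABET = {
    "A": (1.0, 0.0, 0.0, 0.0),
    "C": (0.0, 1.0, 0.0, 0.0),
    "G": (0.0, 0.0, 1.0, 0.0),
    "T": (0.0, 0.0, 0.0, 1.0),
    "M": (0.5, 0.5, 0.0, 0.0),  # half A, half C
    "K": (0.0, 0.0, 0.5, 0.5),  # half G, half T
}


def synthesize(symbols, num_strands, rng):
    """Return num_strands ordinary DNA strands for one composite sequence.

    Each position of each strand is drawn independently from the mixing
    ratio of the composite letter at that position (an assumed model).
    """
    return [
        "".join(rng.choices(BASES, weights=COMPOSITE_ALPHABET[s])[0] for s in symbols)
        for _ in range(num_strands)
    ]


def decode(strands):
    """Estimate the composite letter at each position from base frequencies."""
    decoded = []
    for i in range(len(strands[0])):
        counts = Counter(strand[i] for strand in strands)
        freqs = [counts[b] / len(strands) for b in BASES]
        # Pick the letter whose ratio is closest in squared Euclidean distance.
        best = min(
            COMPOSITE_ALPHABET,
            key=lambda s: sum((f - p) ** 2 for f, p in zip(freqs, COMPOSITE_ALPHABET[s])),
        )
        decoded.append(best)
    return "".join(decoded)


message = "AMKGTC"
reads = synthesize(message, num_strands=200, rng=random.Random(0))
recovered = decode(reads)
```

This sketch models no substitutions, strand losses, or deletions. It only shows why a position can carry more than four values once many strands are read together, which is the setting in which the error types named in the abstract arise.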
Taxonomy
Topics: DNA and Biological Computing · DNA and Nucleic Acid Chemistry · RNA and protein synthesis mechanisms
