On the Reliability of Information Retrieval From MDS Coded Data in DNA Storage
Serge Kas Hanna

TL;DR
This paper provides a theoretical analysis of the success probability of retrieving data encoded with MDS codes in DNA storage, considering substitution errors and system parameters.
Contribution
It offers a mathematical framework to optimize DNA storage reliability by analyzing how various factors affect data retrieval success.
Findings
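Success probability depends on the total number of sequencing reads, their distribution across strands, the inner and outer code rates, and the substitution error probabilities.
The analysis identifies the optimal balance between the rates of the inner and outer MDS codes under a reliability constraint.
It determines the minimum number of sequencing reads needed for reliable data retrieval.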
Abstract
This work presents a theoretical analysis of the probability of successfully retrieving data encoded with MDS codes (e.g., Reed-Solomon codes) in DNA storage systems. We study this probability under independent and identically distributed (i.i.d.) substitution errors, focusing on a common code design strategy that combines inner and outer MDS codes. Our analysis demonstrates how this probability depends on factors such as the total number of sequencing reads, their distribution across strands, the rates of the inner and outer codes, and the substitution error probabilities. These results provide actionable insights into optimizing DNA storage systems under reliability constraints, including determining the minimum number of sequencing reads needed for reliable data retrieval and identifying the optimal balance between the rates of inner and outer MDS codes.
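To make the inner/outer trade-off concrete, the following is a minimal sketch of a simplified model, not the paper's analysis. It assumes each strand is decoded from a single noiseless-count read (the number of reads and their distribution across strands are ignored), that each inner codeword symbol suffers an i.i.d. substitution with probability `p`, that the inner MDS code uses bounded-distance decoding, and that every inner decoding failure is detected and handed to the outer code as an erasure. The function names and the parameters `n`, `k`, `N`, `K` are hypothetical, introduced only for illustration.

```python
from math import comb


def binomial_cdf(n: int, t: int, p: float) -> float:
    """Return P[X <= t] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(t + 1))


def inner_success(n: int, k: int, p: float) -> float:
    """Probability that an (n, k) MDS inner codeword decodes correctly.

    Bounded-distance decoding corrects up to floor((n - k) / 2)
    symbol substitutions.
    """
    t = (n - k) // 2
    return binomial_cdf(n, t, p)


def outer_success(N: int, K: int, q: float) -> float:
    """Probability that an (N, K) MDS outer code recovers the data.

    Assumption: each strand is erased independently with probability q,
    and an MDS code tolerates up to N - K erasures.
    """
    return binomial_cdf(N, N - K, q)


def retrieval_success(n: int, k: int, N: int, K: int, p: float) -> float:
    """Overall success probability under the simplified model above."""
    q = 1.0 - inner_success(n, k, p)  # strand-level failure probability
    return outer_success(N, K, q)


if __name__ == "__main__":
    # Compare two ways of spending redundancy at a fixed symbol error rate.
    for n, k, N, K in [(20, 10, 30, 28), (20, 14, 30, 20)]:
        print((n, k, N, K), retrieval_success(n, k, N, K, p=0.05))
```

For example, `inner_success(4, 2, 0.5)` evaluates to 0.3125, since a (4, 2) code corrects one substitution and P[X ≤ 1] = 5/16 for X ~ Binomial(4, 0.5). Sweeping `k` and `K` at a fixed overall rate gives a rough feel for how redundancy can be split between the two codes, though the paper's results account for read counts and their distribution, which this sketch does not.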
