Random Access in DNA Storage: Algorithms, Constructions, and Bounds
Chen Wang, Eitan Yaakobi

TL;DR
This paper studies random access in DNA data storage: it gives an algorithm that computes the exact expected number of reads, derives bounds on that quantity, and proposes code constructions that improve random access efficiency.
Contribution
It introduces a novel linear-complexity algorithm for computing the exact expected number of reads, and provides new code constructions and bounds that improve random access performance in DNA storage.
Findings
Exact expected number of reads computed efficiently
Improved upper bounds for code constructions
Tighter lower bounds establishing optimality of the simple parity code
Abstract
As DNA data storage moves closer to practical deployment, minimizing sequencing coverage depth is essential to reduce both operational costs and retrieval latency. This paper addresses the recently studied Random Access Problem, which evaluates the expected number of read samples required to recover a specific information strand from encoded strands. We propose a novel algorithm to compute the exact expected number of reads, achieving a computational complexity of for fixed field size and information length . Furthermore, we derive explicit formulas for the average and maximum expected number of reads, enabling an efficient search for optimal generator matrices under small parameters. Beyond theoretical analysis, we present new code constructions that improve the best-known upper bound from to for , and achieve an upper bound of for…
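To make the quantity in the abstract concrete, the following is a minimal brute-force sketch, not the paper's algorithm. It assumes the usual model in which each read returns one of the n encoded strands uniformly at random with replacement, and that an information strand is recovered once its unit vector lies in the span of the generator-matrix columns read so far. It is restricted to GF(2) with columns stored as bitmasks, enumerates subsets of columns (exponential in n), and the names `in_span` and `expected_reads` are illustrative only.

```python
from fractions import Fraction
from functools import lru_cache


def in_span(target, vectors):
    """Return True if bitmask `target` is a GF(2) combination of `vectors`."""
    basis = []  # XOR basis whose elements have distinct leading bits
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears b's leading bit in v if it is set
        if v:
            basis.append(v)
    for b in basis:
        target = min(target, target ^ b)
    return target == 0


def expected_reads(columns, target):
    """Exact expected number of uniform reads until `target` is recoverable.

    `columns` holds the generator-matrix columns as bitmasks (assumed model);
    `target` is the unit vector of the requested information strand.
    """
    n = len(columns)

    @lru_cache(maxsize=None)
    def solve(seen):
        # `seen` is a bitmask over column indices already read at least once.
        chosen = [columns[j] for j in range(n) if (seen >> j) & 1]
        if in_span(target, chosen):
            return Fraction(0)
        unseen = [j for j in range(n) if not (seen >> j) & 1]
        if not unseen:
            raise ValueError("target is not in the span of all columns")
        # E[S] = 1 + (|S|/n) E[S] + (1/n) * sum of E[S + c] over unseen c,
        # which rearranges to (n + sum) / (n - |S|).
        total = sum((solve(seen | (1 << j)) for j in unseen), Fraction(0))
        return (n + total) / len(unseen)

    return solve(0)


# Two information strands stored uncoded, then with one added parity strand.
print(expected_reads([0b01, 0b10], 0b01))        # 2
print(expected_reads([0b01, 0b10, 0b11], 0b01))  # 2
```

Exact rational arithmetic is used so that small generator matrices can be compared without floating-point noise; the subset enumeration is only practical for small n.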
Taxonomy
Topics: DNA and Biological Computing · Cancer Genomics and Diagnostics · Genomics and Phylogenetic Studies
