Making it to First: The Random Access Problem in DNA Storage
Avital Boruchovsky, Ohad Elishco, Ryan Gabrys, Anina Gruica, Itzhak Tamo, and Eitan Yaakobi

TL;DR
This paper investigates the random access problem in DNA storage, giving an optimal code design for the smallest case and a generalized construction that outperforms previous methods for larger numbers of information strands.
Contribution
It fully solves the case of two information strands (k=2) and generalizes a construction previously specific to k=3 to any value of k, improving random access efficiency.
Findings
For k=2, the optimal code achieves a random access expectation of approximately 0.914·k, i.e. about 1.83 expected reads.
For every k≥4, the generalized construction achieves a lower random access expectation than previous constructions.
The construction uses B_{k-1} sequences, which exist whenever the finite field is large enough.
Abstract
In this paper, we study the Random Access Problem in DNA storage, which addresses the challenge of retrieving a specific information strand from a DNA-based storage system. In this framework, the data is represented by $k$ information strands, which are encoded into $n$ strands using a linear code. Each sequencing read then returns one encoded strand, chosen uniformly at random. The goal under this paradigm is to design codes that minimize the expected number of reads required to recover an arbitrary information strand. We fully solve the case when $k=2$, showing that the best possible code attains a random access expectation of $0.914 \cdot 2$ for $q$ large enough. Moreover, we generalize a construction from [GMZ24], specific to $k=3$, to any value of $k$. Our construction uses $B_{k-1}$ sequences over…
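To make the read model in the abstract concrete, here is a minimal sketch that computes the exact expected number of reads needed to recover one information strand from a small linear code. It is not the paper's construction. The function names `rank_mod_p` and `expected_reads`, the restriction to a prime field GF(p), and the representation of the code as a list of generator-matrix columns are all assumptions made for illustration. A strand counts as recovered once its unit vector lies in the span of the distinct columns read so far.

```python
from fractions import Fraction
from functools import lru_cache


def rank_mod_p(vectors, p):
    """Rank of a list of vectors over GF(p), p prime (Gaussian elimination)."""
    rows = [list(v) for v in vectors]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] % p:
                f = rows[r][col]
                rows[r] = [(a - f * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def expected_reads(columns, target, p):
    """Exact expected number of uniform reads (with replacement) until
    information strand `target` is recoverable. Illustrative only:
    enumerates all 2^n subsets of columns, so n must be small."""
    n = len(columns)
    k = len(columns[0])
    unit = [1 if j == target else 0 for j in range(k)]

    def recovers(mask):
        chosen = [columns[j] for j in range(n) if mask >> j & 1]
        # The unit vector is in the span iff appending it keeps the rank unchanged.
        return rank_mod_p(chosen, p) == rank_mod_p(chosen + [unit], p)

    @lru_cache(maxsize=None)
    def expect(mask):
        if recovers(mask):
            return Fraction(0)
        unseen = [j for j in range(n) if not mask >> j & 1]
        if not unseen:
            raise ValueError("target strand is not recoverable from this code")
        # Re-reading a seen column wastes a read; solving for E(mask) gives:
        return (n + sum(expect(mask | 1 << j) for j in unseen)) / Fraction(len(unseen))

    return expect(0)


# Uncoded storage (identity code) with k = 2 over GF(2).
identity = [(1, 0), (0, 1)]
print(expected_reads(identity, target=0, p=2))
```

For the uncoded identity code, retrieving one of k strands is a geometric trial with success probability 1/k, so the expectation is k reads. The result for k=2 reported in the abstract, 0.914·2, sits below that baseline.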
Taxonomy
Topics: DNA and Biological Computing
