Batch Optimization for DNA Synthesis
Konstantin Makarychev, Miklos Z. Racz, Cyrus Rashtchian, Sergey Yekhanin

TL;DR
This paper introduces batch optimization techniques to reduce the cost of large-scale DNA synthesis for data storage, demonstrating significant savings especially for non-repetitive DNA sequences.
Contribution
It proposes two novel batch optimization methods and proves their asymptotic optimality, highlighting the impact of sequence constraints on synthesis cost savings.
Findings
Batch optimization reduces DNA synthesis costs.
Using reverse reference strands improves batching efficiency.
Cost savings are greater for non-repetitive DNA sequences.
Abstract
Large pools of synthetic DNA molecules have been recently used to reliably store significant volumes of digital data. While DNA as a storage medium has enormous potential because of its high storage density, its practical use is currently severely limited because of the high cost and low throughput of available DNA synthesis technologies. We study the role of batch optimization in reducing the cost of large-scale DNA synthesis, which translates to the following algorithmic task. Given a large pool S of random quaternary strings of fixed length, partition S into batches in a way that minimizes the sum of the lengths of the shortest common supersequences across batches. We introduce two ideas for batch optimization that both improve (in different ways) upon a naive baseline: (1) using both (ACGT)* and its reverse (TGCA)* as reference strands, and batching…
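The cost model in the abstract can be illustrated with a small sketch. It assumes that synthesis follows a periodic reference strand (ACGTACGT... or its reverse TGCATGCA...), that a strand's cost is the shortest prefix of that reference containing it as a subsequence, and that a batch's cost is the largest such value among its strands. That value is an upper bound on the length of the batch's shortest common supersequence, not the supersequence length itself. The function names and the "assign each strand to its cheaper reference" rule are illustrative assumptions, not the paper's batching algorithm.

```python
from typing import List, Tuple

def cycles_needed(strand: str, order: str = "ACGT") -> int:
    """Length of the shortest prefix of the periodic reference order*
    that contains `strand` as a subsequence (greedy matching)."""
    pos = 0  # number of reference characters consumed so far
    for base in strand:
        # Steps to skip before the next occurrence of `base` in the reference.
        offset = (order.index(base) - pos) % len(order)
        pos += offset + 1
    return pos

def batch_cost(batch: List[str], order: str) -> int:
    """Cost of one batch under a fixed periodic reference: the slowest strand."""
    return max((cycles_needed(s, order) for s in batch), default=0)

def split_by_reference(strands: List[str]) -> Tuple[List[str], List[str], int]:
    """Hypothetical rule: put each strand in the batch whose reference
    (forward ACGT or reverse TGCA) synthesizes it in fewer steps."""
    forward, reverse = [], []
    for s in strands:
        if cycles_needed(s, "ACGT") <= cycles_needed(s, "TGCA"):
            forward.append(s)
        else:
            reverse.append(s)
    total = batch_cost(forward, "ACGT") + batch_cost(reverse, "TGCA")
    return forward, reverse, total

if __name__ == "__main__":
    pool = ["ACGT", "TGCA"]
    print(batch_cost(pool, "ACGT"))     # single batch, forward reference: 13
    print(split_by_reference(pool)[2])  # two batches, two references: 8
```

In this toy pool the strand TGCA is expensive under the forward reference (13 steps) but cheap under the reverse one (4 steps), so splitting by reference lowers the total from 13 to 8.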
Taxonomy
Topics: Advanced biosensing and bioanalysis techniques · DNA and Biological Computing · Algorithms and Data Compression
