Optimizing Sequencing Coverage Depth in DNA Storage: Insights From DNA Storage Data
Ruiying Cao, Xin Chen

TL;DR
This paper studies how to minimize sequencing coverage depth in DNA storage systems. Using real experimental data, it extends earlier theoretical models, which assumed uniform distribution channels, to more realistic log-normal distribution channels.
Contribution
It analyzes DNA storage coverage depth using real experimental data, extends existing models from uniform to log-normal channels, and covers both noiseless and noisy scenarios.
Findings
Positive correlation between MDS code rate and minimum coverage depth
Decoding success probability varies with code rate and sample size
Extended lower bounds for coverage depth in log-normal noisy channels
Abstract
DNA storage is now being considered as a new archival storage method because of its durability and high information density, but it still faces challenges such as high cost and low throughput. Minimizing DNA coverage depth reduces the sequencing sample size needed to decode the digital data, which lowers both cost and system latency. Previous studies have mainly focused on minimizing coverage depth in uniform distribution channels under theoretical assumptions. In contrast, our work uses real DNA storage experimental data to extend this problem to log-normal distribution channels, a channel model derived from our analysis of PCR and sequencing data. In this framework, we investigate both noiseless and noisy channels. We first demonstrate a detailed positive correlation between MDS code rate and the expected minimum sequencing coverage depth. Moreover, we observe that the probability of successfully…
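The link between MDS code rate and the number of reads needed can be illustrated with a small simulation. The sketch below is not the authors' method or data. It assumes a noiseless channel in which an (n, k) MDS code decodes once any k distinct strands out of n have been read, and it compares uniform strand sampling with sampling weights drawn from a log-normal distribution. The function names, the value sigma = 1.0, and n = 100 are hypothetical choices made for illustration only.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def expected_reads_uniform(n, k):
    """Closed-form mean number of reads needed to see k distinct strands
    out of n when every read picks a strand uniformly at random
    (partial coupon collector)."""
    return sum(n / (n - i) for i in range(k))


def simulate_reads(weights, k, trials, rng):
    """Monte Carlo estimate of the mean number of reads until k distinct
    strands are observed, with strands sampled proportionally to weights."""
    cumulative = list(accumulate(weights))
    total = cumulative[-1]
    last = len(weights) - 1
    reads_sum = 0
    for _ in range(trials):
        seen = set()
        reads = 0
        while len(seen) < k:
            idx = bisect_right(cumulative, rng.random() * total)
            seen.add(min(idx, last))  # guard against floating-point edge case
            reads += 1
        reads_sum += reads
    return reads_sum / trials


def lognormal_weights(n, sigma, rng):
    """Hypothetical strand abundances: one log-normal draw per strand."""
    return [rng.lognormvariate(0.0, sigma) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    n = 100        # number of encoded strands (assumed)
    sigma = 1.0    # log-normal shape parameter (assumed, not from the paper)
    skewed = lognormal_weights(n, sigma, rng)
    for rate in (0.5, 0.7, 0.9):
        k = round(rate * n)
        uniform_mean = expected_reads_uniform(n, k)
        skewed_mean = simulate_reads(skewed, k, trials=500, rng=rng)
        print(f"rate={rate:.1f} k={k} "
              f"uniform={uniform_mean:.1f} log-normal={skewed_mean:.1f}")
```

Running the script prints, for each code rate, the mean number of reads under both sampling models. Dividing those means by the number of strands gives one possible normalization for coverage depth, which may differ from the definition used in the paper.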
