Achieving the Capacity of a DNA Storage Channel with Linear Coding Schemes
Kel Levick (1), Reinhard Heckel (2), Ilan Shomorony (1) ((1) University of Illinois Urbana-Champaign, (2) Technical University of Munich)

TL;DR
This paper shows that linear coding schemes achieve the capacity of a multi-draw DNA storage channel with binary erasure noise, which yields a simpler derivation of the capacity expression than the typicality-based arguments used in earlier work.
Contribution
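The paper shows that linear coding schemes suffice to achieve capacity in a multi-draw "shuffling-sampling" DNA storage channel whose noise is a binary erasure channel, so the achievability proof no longer needs the sophisticated typicality-based approaches of prior results.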
Findings
Linear coding schemes achieve the channel capacity.
The capacity expression has a considerably simpler derivation than existing results in the literature.
The result applies to a multi-draw DNA storage channel with binary erasure noise.
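To make "linear coding scheme" and "erasure noise" concrete, here is a minimal sketch of erasure decoding for a binary linear code. It is not the paper's construction: the function names `encode` and `decode_erasures` and the (7,4) Hamming generator matrix are illustrative assumptions. Each surviving codeword bit gives one linear equation over GF(2) in the message bits, so decoding reduces to Gaussian elimination.

```python
def encode(message, generator):
    """Codeword bit j is the GF(2) inner product of the message with column j."""
    n = len(generator[0])
    return [sum(m & row[j] for m, row in zip(message, generator)) % 2
            for j in range(n)]


def decode_erasures(received, generator):
    """Recover the message from a codeword whose erased bits are None.

    Returns the message bits, or None when the surviving positions do not
    determine the message uniquely.
    """
    k = len(generator)
    # One equation per surviving position: a column of G plus the observed bit.
    rows = [[generator[i][j] for i in range(k)] + [bit]
            for j, bit in enumerate(received) if bit is not None]
    pivot_row = 0
    for col in range(k):
        found = next((r for r in range(pivot_row, len(rows)) if rows[r][col]), None)
        if found is None:
            return None  # no pivot: this message bit is undetermined
        rows[pivot_row], rows[found] = rows[found], rows[pivot_row]
        # Gauss-Jordan step: clear this column in every other row (XOR = GF(2) add).
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return [rows[i][k] for i in range(k)]


# Illustrative (7,4) Hamming code in systematic form (an assumption, not from the paper).
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]

message = [1, 0, 1, 1]
codeword = encode(message, G)
received = [None if j in (0, 2) else bit for j, bit in enumerate(codeword)]
recovered = decode_erasures(received, G)
```

Because this code has minimum distance 3, any two erased positions can be recovered; when too few positions survive, `decode_erasures` returns `None` instead of guessing.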
Abstract
Due to the redundant nature of DNA synthesis and sequencing technologies, a basic model for a DNA storage system is a multi-draw "shuffling-sampling" channel. In this model, a random number of noisy copies of each sequence is observed at the channel output. Recent works have characterized the capacity of such a DNA storage channel under different noise and sequencing models, relying on sophisticated typicality-based approaches for the achievability. Here, we consider a multi-draw DNA storage channel in the setting of noise corruption by a binary erasure channel. We show that, in this setting, the capacity is achieved by linear coding schemes. This leads to a considerably simpler derivation of the capacity expression of a multi-draw DNA storage channel than existing results in the literature.
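The channel model in the abstract can be sketched as a small simulation. This is a rough illustration under stated assumptions, not the paper's exact model: reads are drawn uniformly with replacement (so the number of copies of each sequence is random), each bit is erased independently, and the names `shuffling_sampling_bec`, `num_draws`, and `erasure_prob` are hypothetical.

```python
import random

ERASED = None  # placeholder symbol for an erased bit


def shuffling_sampling_bec(sequences, num_draws, erasure_prob, rng):
    """Simulate a multi-draw shuffling-sampling channel with erasure noise.

    Each of the `num_draws` reads is a stored sequence picked uniformly at
    random with replacement (an assumed sampling model), passed bit by bit
    through a binary erasure channel. The output order carries no information
    about which stored sequence a read came from.
    """
    reads = []
    for _ in range(num_draws):
        seq = rng.choice(sequences)
        reads.append(tuple(ERASED if rng.random() < erasure_prob else bit
                           for bit in seq))
    return reads


stored = [(0, 1, 1, 0), (1, 1, 0, 0), (0, 0, 0, 1)]
reads = shuffling_sampling_bec(stored, num_draws=10, erasure_prob=0.3,
                               rng=random.Random(0))
```

Some stored sequences may appear several times in `reads` and others not at all, which is the "random number of noisy copies" behaviour the abstract describes.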
Taxonomy
Topics: DNA and Biological Computing · Advanced biosensing and bioanalysis techniques · Algorithms and Data Compression
