Geno-Weaving: Low-Complexity Capacity-Achieving DNA Storage
Hsin-Po Wang, Venkatesan Guruswami

TL;DR
This paper proposes a low-complexity, capacity-achieving coding scheme for DNA data storage that effectively reconstructs data from noisy, unsorted reads by combining rateless coding with capacity-achieving block codes.
Contribution
It introduces a novel coding scheme that weaves rateless and block codes to achieve capacity in DNA storage with noisy, unordered reads.
Findings
Achieves DNA storage capacity with low complexity.
Effectively reconstructs data from noisy, unordered reads.
Combines rateless and block coding techniques.
Abstract
As a possible implementation of data storage using DNA, multiple strands of DNA are stored in a liquid container so that, in the future, they can be read by an array of DNA readers in parallel. These readers will sample the strands with replacement to produce a random number of noisy reads for each strand. An essential component of such a data storage system is how to reconstruct data out of these unsorted, repetitive, and noisy reads. It is known that if a single read can be modeled by a substitution channel, then the overall capacity can be expressed by the "Poisson-ization" of that channel. In this paper, we lay down a rateless code along each strand to encode its index; we then lay down a capacity-achieving block code at the same position across all strands to protect data. That weaves a low-complexity coding scheme that achieves DNA's capacity.
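The read model in the abstract (sampling with replacement, a random number of noisy reads per strand, no ordering) can be illustrated with a small simulation. This is a minimal sketch and not the paper's construction: the function names, the strand length, the read count, and the choice of a symmetric substitution channel with error probability `p` are all assumptions made for illustration.

```python
import random
from collections import Counter

ALPHABET = "ACGT"

def substitute(strand, p, rng):
    """Assumed noise model: each base is replaced, independently with
    probability p, by one of the three other bases chosen uniformly."""
    out = []
    for base in strand:
        if rng.random() < p:
            out.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            out.append(base)
    return "".join(out)

def sample_reads(strands, num_reads, p, rng):
    """Draw num_reads strands uniformly with replacement and pass each
    through the substitution channel. Returns the reads in shuffled
    (unsorted) order and the number of reads each strand received."""
    picks = [rng.randrange(len(strands)) for _ in range(num_reads)]
    reads = [substitute(strands[i], p, rng) for i in picks]
    rng.shuffle(reads)  # the decoder does not learn which strand a read came from
    tally = Counter(picks)
    counts = [tally.get(i, 0) for i in range(len(strands))]
    return reads, counts

# Hypothetical parameters: 1000 strands of 20 bases, 2000 reads, 5% substitutions.
rng = random.Random(0)
strands = ["".join(rng.choice(ALPHABET) for _ in range(20)) for _ in range(1000)]
reads, counts = sample_reads(strands, num_reads=2000, p=0.05, rng=rng)

# How many strands were read 0 times, 1 time, 2 times, ...
histogram = Counter(counts)
```

With N reads drawn uniformly from M strands, each strand's read count is Binomial(N, 1/M), which is close to Poisson(N/M) when M is large. In the setting above the mean is 2, so roughly e^-2 (about 13.5%) of strands are expected to receive no read at all. This is why data must be protected across strands as well as within each strand.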
Taxonomy
Topics: Advanced biosensing and bioanalysis techniques
