On Coding for an Abstracted Nanopore Channel for DNA Storage
Reyna Hulett, Shubham Chandak, Mary Wootters

TL;DR
This paper studies the capacity of a highly abstracted, deterministic nanopore channel model for DNA storage and proposes efficient coding schemes and algorithms for it.
Contribution
The paper introduces a highly abstracted, deterministic model of nanopore sequencing that highlights the features making its analysis difficult, and develops theory and coding schemes for that model.
Findings
Derived capacity bounds for the abstracted nanopore model
Developed efficient coding schemes for DNA storage
Proposed algorithms for encoding and decoding
Abstract
In the emerging field of DNA storage, data is encoded as DNA sequences and stored. The data is read out again by sequencing the stored DNA. Nanopore sequencing is a new sequencing technology that has many advantages over other methods; in particular, it is cheap, portable, and can support longer reads. While several practical coding schemes have been developed for DNA storage with nanopore sequencing, the theory is not well understood. Towards that end, we study a highly abstracted (deterministic) version of the nanopore sequencer, which highlights key features that make its analysis difficult. We develop methods and theory to understand the capacity of our abstracted model, and we propose efficient coding schemes and algorithms.
