Coded Information Retrieval for Block-Structured DNA-Based Data Storage
Daniella Bar-Lev

TL;DR
This paper investigates the optimal retrieval times for block-structured DNA data storage using linear codes, deriving bounds, analyzing specific code families, and characterizing the asymptotic achievable region.
Contribution
It introduces a formalized block-structured retrieval problem, derives new bounds, analyzes specific code constructions, and characterizes the asymptotic retrieval time region.
Findings
Hyperbolic constraint on expected retrieval times for no-mixed-column codes
File-dedicated MDS codes are optimal within their family
Asymptotic boundary achieved by file-dedicated MDS codes
Abstract
We study the problem of coded information retrieval for block-structured data, motivated by DNA-based storage systems where a database is partitioned into multiple files that must each be recoverable as an atomic unit. We initiate and formalize the block-structured retrieval problem, wherein information symbols are partitioned into two files and of sizes and . The objective is to characterize the set of achievable expected retrieval time pairs over all linear codes with generator matrix . We derive a family of linear lower bounds via mutual exclusivity of recovery sets, and develop a nonlinear geometric bound via column projection. For codes with no mixed columns, this yields the hyperbolic constraint , which we conjecture to hold universally whenever . We analyze…
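As a rough illustration of what an "expected retrieval time pair" could look like for a file-dedicated MDS code, the sketch below computes both expectations in closed form. Everything here is an assumption and not taken from the abstract: the random-access model (each read returns one of the n coded columns uniformly at random, with replacement), the symbol names k1, k2, n1, n2, and the function names. Under that assumed model, a file protected by its own [n_i, k_i] MDS code is recoverable once any k_i distinct columns out of its n_i dedicated columns have been read, so the waiting time is a sum of geometric stages.

```python
from fractions import Fraction


def expected_retrieval_time(n_total, n_file, k_file):
    """Expected number of reads until k_file distinct columns of one file are seen.

    Assumed model (not from the source): each read is uniform over n_total
    coded columns, with replacement; n_file of those columns are dedicated
    to the file and any k_file distinct ones suffice (MDS property).
    """
    if not (0 < k_file <= n_file <= n_total):
        raise ValueError("need 0 < k_file <= n_file <= n_total")
    # After j distinct file columns are seen, the next read is a new one with
    # probability (n_file - j) / n_total, so that stage takes
    # n_total / (n_file - j) reads in expectation.
    return sum(Fraction(n_total, n_file - j) for j in range(k_file))


def file_dedicated_pair(k1, k2, n1, n2):
    """Expected retrieval times (T1, T2) when file i gets its own [n_i, k_i] MDS code."""
    n = n1 + n2  # hypothetical total code length: all columns are file-dedicated
    return (
        expected_retrieval_time(n, n1, k1),
        expected_retrieval_time(n, n2, k2),
    )
```

For example, under these assumptions `expected_retrieval_time(4, 2, 2)` evaluates to 6: two reads in expectation to hit the first of the file's two columns, then four more to hit the remaining one. Shifting redundancy from one file to the other (changing n1 and n2 with n1 + n2 fixed) lowers one expectation and raises the other, which is the kind of trade-off the achievable region describes.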
Taxonomy
Topics: DNA and Biological Computing · Advanced Data Storage Technologies · Algorithms and Data Compression
