Coded Information Retrieval for Block-Structured DNA-Based Data Storage

Daniella Bar-Lev

arXiv:2603.17154 · cs.IT · March 19, 2026

Open Access

TL;DR

This paper studies which expected retrieval times are achievable for block-structured DNA data storage under linear codes: it derives bounds, analyzes specific code families, and characterizes the asymptotic achievable region.

Contribution

It formalizes the block-structured retrieval problem, derives new bounds on the achievable expected retrieval times, analyzes specific code constructions, and characterizes the asymptotic retrieval time region.

Findings

01 Hyperbolic constraint on expected retrieval times for no-mixed-column codes

02 File-dedicated MDS codes are optimal within their family

03 Asymptotic boundary achieved by file-dedicated MDS codes

Abstract

We study the problem of coded information retrieval for block-structured data, motivated by DNA-based storage systems where a database is partitioned into multiple files that must each be recoverable as an atomic unit. We initiate and formalize the block-structured retrieval problem, wherein $k$ information symbols are partitioned into two files $F_{1}$ and $F_{2}$ of sizes $s_{1}$ and $s_{2} = k - s_{1}$. The objective is to characterize the set of achievable expected retrieval time pairs $(E_{1}(G), E_{2}(G))$ over all $[n, k]$ linear codes with generator matrix $G$. We derive a family of linear lower bounds via mutual exclusivity of recovery sets, and develop a nonlinear geometric bound via column projection. For codes with no mixed columns, this yields the hyperbolic constraint $s_{1} / E_{1} + s_{2} / E_{2} \leq 1$, which we conjecture to hold universally whenever $\max\{s_{1}, s_{2}\} \geq 2$. We analyze…
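As a rough illustration of the hyperbolic constraint, the sketch below evaluates $s_{1}/E_{1} + s_{2}/E_{2}$ for a file-dedicated MDS layout. It rests on assumptions that are not stated in the excerpt above: encoded symbols (columns of $G$) are read uniformly at random with replacement, file $F_{i}$ is protected by its own group of $n_{i}$ columns forming an $[n_{i}, s_{i}]$ MDS code with $n_{1} + n_{2} = n$, and $E_{i}$ is the expected number of reads until $s_{i}$ distinct columns of group $i$ have been seen. The function names are hypothetical and chosen for this example only.

```python
from fractions import Fraction


def expected_retrieval_time(n, n_i, s_i):
    """Expected number of uniform reads (with replacement, out of n columns)
    until s_i distinct columns from a dedicated group of n_i columns appear.

    Assumption: the group is an [n_i, s_i] MDS code, so any s_i distinct
    columns of the group recover the file. After j useful columns have been
    seen, a read is useful with probability (n_i - j) / n, so the wait for
    the next one is geometric with mean n / (n_i - j).
    """
    if not 0 < s_i <= n_i <= n:
        raise ValueError("require 0 < s_i <= n_i <= n")
    return sum(Fraction(n, n_i - j) for j in range(s_i))


def hyperbolic_sum(n, n1, s1, s2):
    """Return s1/E1 + s2/E2 for a file-dedicated MDS layout (assumed model)."""
    n2 = n - n1
    e1 = expected_retrieval_time(n, n1, s1)
    e2 = expected_retrieval_time(n, n2, s2)
    return Fraction(s1) / e1 + Fraction(s2) / e2


if __name__ == "__main__":
    # Small code: n = 6 columns split evenly, two files of size 2.
    print(hyperbolic_sum(6, 3, 2, 2))
    # Longer code with the same split ratio: the sum moves closer to 1.
    print(float(hyperbolic_sum(600, 300, 2, 2)))
```

Under these assumptions, $E_{i} = n \sum_{j=0}^{s_{i}-1} 1/(n_{i} - j) \geq s_{i} n / n_{i}$, so $s_{i}/E_{i} \leq n_{i}/n$ and the sum never exceeds 1. It approaches 1 as $n$ grows with the split ratio held fixed, which is consistent with the asymptotic boundary finding listed above.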

Taxonomy

Topics: DNA and Biological Computing · Advanced Data Storage Technologies · Algorithms and Data Compression