# Coded trace reconstruction

**Authors:** Mahdi Cheraghchi, Ryan Gabrys, Olgica Milenkovic, João Ribeiro

arXiv: 1903.09992 · 2019-09-11

## TL;DR

This paper introduces coded trace reconstruction: the design of high-rate, efficiently encodable codes that can be efficiently decoded with high probability from few traces corrupted by i.i.d. deletions. It is a first step towards codes with provable guarantees for portable DNA-based storage.

## Contribution

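The paper initiates the study of coded trace reconstruction and analyzes marker-based code constructions under i.i.d. deletions, as a first step towards efficient codes with provable guarantees for portable DNA-based storage systems, where current codes are largely heuristic.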

## Key findings

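- Marker-based codes with $O(n/\log n)$ (resp. $O(n/\log\log n)$) redundancy can be efficiently reconstructed from $\exp(O(\log^{2/3} n))$ (resp. $\exp(O(\log\log n)^{2/3})$) traces under a constant deletion probability.
- A code with $O(\log n)$ bits of redundancy can be efficiently reconstructed from $\textrm{poly}(n)$ traces if the deletion probability is small enough.
- Combining both approaches yields an efficient code with $O(n/\log n)$ bits of redundancy that can be reconstructed from $\textrm{poly}(\log n)$ traces for a small constant deletion probability.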
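The trace counts quoted in the abstract are easier to compare once written as powers of $n$. The short calculation below is a sketch added for orientation, not taken from the paper; $c$ and $k$ are arbitrary positive constants introduced here for illustration, and $\log$ is the natural logarithm.

$$
\exp\!\left(c \log^{2/3} n\right) = \exp\!\left(\log n \cdot \frac{c}{\log^{1/3} n}\right) = n^{\,c/\log^{1/3} n} = n^{o(1)},
\qquad
(\log n)^{k} = \exp\!\left(k \log\log n\right).
$$

Since $\log\log n = o(\log^{2/3} n)$ and $\log^{2/3} n = o(\log n)$, the three regimes are ordered as $\textrm{poly}(\log n) \ll \exp(O(\log^{2/3} n)) \ll \textrm{poly}(n)$ for large $n$. In particular, $\exp(O(\log^{2/3} n))$ traces is a sub-polynomial number of traces.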

## Abstract

Motivated by average-case trace reconstruction and coding for portable DNA-based storage systems, we initiate the study of *coded trace reconstruction*, the design and analysis of high-rate efficiently encodable codes that can be efficiently decoded with high probability from few reads (also called *traces*) corrupted by edit errors. Codes used in current portable DNA-based storage systems with nanopore sequencers are largely based on heuristics, and have no provable robustness or performance guarantees even for an error model with i.i.d. deletions and constant deletion probability. Our work is a first step towards the design of efficient codes with provable guarantees for such systems. We consider a constant rate of i.i.d. deletions, and perform an analysis of marker-based code-constructions. This gives rise to codes with redundancy $O(n/\log n)$ (resp. $O(n/\log\log n)$) that can be efficiently reconstructed from $\exp(O(\log^{2/3}n))$ (resp. $\exp(O(\log\log n)^{2/3})$) traces, where $n$ is the message length. Then, we give a construction of a code with $O(\log n)$ bits of redundancy that can be efficiently reconstructed from $\textrm{poly}(n)$ traces if the deletion probability is small enough. Finally, we show how to combine both approaches, giving rise to an efficient code with $O(n/\log n)$ bits of redundancy which can be reconstructed from $\textrm{poly}(\log n)$ traces for a small constant deletion probability.
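To make the error model concrete, the following is a minimal Python sketch of the i.i.d. deletion channel described in the abstract: each symbol of the stored string is dropped independently with probability `p`, and a decoder sees several independent traces. The function names (`deletion_channel`, `sample_traces`, `is_subsequence`), the seed handling, and the example string are assumptions made for illustration; the sketch only simulates the channel and does not implement any of the paper's code constructions or reconstruction algorithms.

```python
import random


def deletion_channel(x, p, rng):
    """Return one trace of x: each symbol is deleted independently with probability p."""
    return "".join(ch for ch in x if rng.random() >= p)


def sample_traces(x, p, t, seed=0):
    """Return t independent traces of x (hypothetical helper, seeded for repeatability)."""
    rng = random.Random(seed)
    return [deletion_channel(x, p, rng) for _ in range(t)]


def is_subsequence(y, x):
    """Check that y can be obtained from x by deleting symbols."""
    it = iter(x)
    return all(ch in it for ch in y)


if __name__ == "__main__":
    stored = "ACGTTGCAACGT"  # illustrative stored string, not from the paper
    for trace in sample_traces(stored, p=0.2, t=5):
        print(trace, is_subsequence(trace, stored))
```

Every trace is a subsequence of the stored string, and its expected length is $(1-p)\,n$. The coded setting asks how few such traces suffice when the stored string is restricted to a well-chosen code.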

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.09992/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/1903.09992/full.md

---
Source: https://tomesphere.com/paper/1903.09992