Optimal mean-based algorithms for trace reconstruction
Anindya De, Ryan O'Donnell, Rocco Servedio

TL;DR
This paper establishes tight bounds for mean-based algorithms in trace reconstruction, showing that \(\exp(O(n^{1/3}))\) time and samples suffice and that \(\exp(\Omega(n^{1/3}))\) samples are necessary.
Contribution
It provides matching upper and lower bounds for mean-based trace reconstruction, extending results to various deletion probabilities and incorporating insertions and bit-flips.
Findings
Mean-based algorithms need time and samples of order \(\exp(\Theta(n^{1/3}))\).
Matching bounds are proven for deletion probabilities \(\delta\) in different regimes.
Insertions and bit-flips can be handled, with insertions aiding reconstruction when \(\delta > 1/2\).
Abstract
In the (deletion-channel) trace reconstruction problem, there is an unknown \(n\)-bit source string \(x\). An algorithm is given access to independent traces of \(x\), where a trace is formed by deleting each bit of \(x\) independently with probability \(\delta\). The goal of the algorithm is to recover \(x\) exactly (with high probability), while minimizing samples (number of traces) and running time. Previously, the best known algorithm for the trace reconstruction problem was due to Holenstein et al.; it uses \(\exp(\tilde{O}(n^{1/2}))\) samples and running time for any fixed \(0 < \delta < 1\). It is also what we call a "mean-based algorithm", meaning that it only uses the empirical means of the individual bits of the traces. Holenstein et al. also gave a lower bound, showing that any mean-based algorithm must use at least \(n^{\tilde{\Omega}(\log n)}\) samples. In this paper we improve both of…
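The deletion channel and the statistic a mean-based algorithm looks at can be illustrated with a small simulation. This is only a sketch under assumptions that are not taken from the paper: the function names `deletion_trace` and `empirical_mean_trace` are hypothetical, bits are represented as 0/1 integers, and traces shorter than \(n\) are padded with zeros before averaging. It does not implement the reconstruction algorithm itself, only the per-position empirical means that such an algorithm would take as input.

```python
import random


def deletion_trace(x, delta, rng):
    """Return one trace of x: each bit is deleted independently with probability delta."""
    return [b for b in x if rng.random() >= delta]


def empirical_mean_trace(x, delta, num_traces, seed=0):
    """Average num_traces traces position by position.

    Assumption for this sketch: a trace shorter than len(x) is treated as
    padded with zeros, so every trace contributes to all n positions.
    """
    rng = random.Random(seed)
    n = len(x)
    sums = [0] * n
    for _ in range(num_traces):
        trace = deletion_trace(x, delta, rng)
        # Positions past the end of the trace implicitly add 0 (the padding).
        for j, bit in enumerate(trace):
            sums[j] += bit
    return [s / num_traces for s in sums]


if __name__ == "__main__":
    source = [1, 0, 1, 1, 0, 0, 1, 0]
    print(empirical_mean_trace(source, delta=0.3, num_traces=10000))
```

With `delta=0.0` nothing is deleted, so the mean vector equals the source string exactly; for larger `delta` the means blur together, which gives some intuition for why many traces are needed to tell two candidate source strings apart from their means alone.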
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Advanced Data Storage Technologies
