Substring Density Estimation from Traces
Kayvon Mazooji, Ilan Shomorony

TL;DR
This paper introduces a method for estimating, from traces, a density map of the substrings of a binary string. Polynomially many traces suffice, which makes trace reconstruction tractable for classes of strings whose density maps are well separated.
Contribution
It presents a novel approach to recover substring density maps from traces with polynomial trace complexity, extending trace reconstruction to a new density estimation problem.
Findings
Polynomial trace complexity for density map recovery
Effective density map estimation with error at most epsilon
Polynomial-trace reconstruction for sets of strings with large density map distance
Abstract
In the trace reconstruction problem, one seeks to reconstruct a binary string s from a collection of traces, each of which is obtained by passing s through a deletion channel. It is known that exp(Õ(n^{1/5})) traces suffice to reconstruct any length-n string with high probability. We consider a variant of the trace reconstruction problem where the goal is to recover a "density map" that indicates the locations of each length-k substring throughout s. We show that ε^{-2} · poly(n) traces suffice to recover the density map with error at most ε. As a result, when restricted to a set of source strings whose minimum "density map distance" is at least 1/poly(n), the trace reconstruction problem can be solved with polynomially many traces.
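To make the setup concrete, the following is a minimal sketch of how traces arise from a deletion channel. It assumes the standard i.i.d. model in which each bit is deleted independently with a fixed probability p; the function names, the parameter p, and the example string are illustrative assumptions, not taken from the paper, and the sketch does not implement the paper's density map estimator.

```python
import random


def deletion_channel(s: str, p: float, rng: random.Random) -> str:
    """Return one trace of s: each bit is deleted independently with probability p.

    Assumption: the i.i.d. deletion model; p is a hypothetical parameter name.
    """
    return "".join(bit for bit in s if rng.random() >= p)


def sample_traces(s: str, p: float, num_traces: int, seed: int = 0) -> list[str]:
    """Draw num_traces independent traces of s from the deletion channel."""
    rng = random.Random(seed)
    return [deletion_channel(s, p, rng) for _ in range(num_traces)]


def is_subsequence(t: str, s: str) -> bool:
    """Check that t can be obtained from s by deleting characters."""
    remaining = iter(s)
    # `ch in remaining` advances the iterator, so order is respected.
    return all(ch in remaining for ch in t)


if __name__ == "__main__":
    source = "1011001110"  # illustrative source string
    for trace in sample_traces(source, p=0.3, num_traces=5):
        print(trace, is_subsequence(trace, source))
```

Every trace is a subsequence of the source string, but the positions of the surviving bits are not observed. That is why recovering where each length-k substring occurs requires aggregating statistics over many traces.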
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Genomics and Phylogenetic Studies
