Reconstructing Strings from Substrings: Optimal Randomized and Average-Case Algorithms
Kazuo Iwama, Junichi Teruyama, Shuntaro Tsuyama

TL;DR
This paper presents a randomized algorithm and an average-case algorithm for reconstructing a binary string of length n from substring queries. Both achieve query complexity n+O(1), improving on the previous n+O(log n) bound by removing its additive logarithmic term.
Contribution
It introduces two algorithms whose query complexity matches the lower bound of n up to an additive constant, removing the additive O(log n) term of the prior approach to string reconstruction.
Findings
Randomized algorithm with query complexity n+O(1) with high probability.
Average-case algorithm with expected query complexity n+O(1).
Both algorithms are optimal up to a constant additive term.
Abstract
The problem called "String reconstruction from substrings" is a mathematical model of sequencing by hybridization that plays an important role in DNA sequencing. In this problem, we are given a blackbox oracle holding an unknown string X and are required to obtain (reconstruct) X through "substring queries" Q(S). Q(S) is given to the oracle with a string S, and the answer of the oracle is Yes if X includes S as a substring and No otherwise. Our goal is to minimize the number of queries for the reconstruction. In this paper, we deal only with binary strings for X, whose length n is given in advance, and reconstruct X by using a sequence of good S's. In 1995, Skiena and Sundaram first studied this problem and obtained an algorithm whose query complexity is n+O(log n). Its information theoretic lower bound is n, and they posed an obvious open…
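The query model described in the abstract can be illustrated with a small sketch. The code below is not the algorithm of the paper or of Skiena and Sundaram; it is a naive baseline that may use about 2n queries, included only to make the oracle interface concrete. The names SubstringOracle and reconstruct_naive are hypothetical and chosen for this illustration.

```python
class SubstringOracle:
    """Hypothetical black box holding a hidden binary string X."""

    def __init__(self, hidden):
        self._hidden = hidden
        self.queries = 0  # number of substring queries asked so far

    def query(self, s):
        """Q(S): True (Yes) if S is a substring of X, False (No) otherwise."""
        self.queries += 1
        return s in self._hidden


def reconstruct_naive(oracle, n):
    """Naive baseline (an assumption for illustration, not the paper's method).

    Invariant: s is always a substring of the hidden string X.
    """
    s = ""
    # Phase 1: extend to the right while some extension is still a substring.
    while len(s) < n:
        if oracle.query(s + "0"):
            s += "0"
        elif oracle.query(s + "1"):
            s += "1"
        else:
            break  # s now occurs only as a suffix of X
    # Phase 2: s is a suffix of X, so exactly one left extension is a substring.
    while len(s) < n:
        s = ("0" if oracle.query("0" + s) else "1") + s
    return s


if __name__ == "__main__":
    oracle = SubstringOracle("0110100")
    print(reconstruct_naive(oracle, 7), oracle.queries)
```

Because every Yes answer keeps s a substring of X, a string of length n that passes all checks must equal X. The baseline spends up to two queries per character in the first phase, which is the kind of overhead the n+O(1) algorithms avoid.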
