Reconstructing Strings from Substrings: Optimal Randomized and Average-Case Algorithms

Kazuo Iwama; Junichi Teruyama; Shuntaro Tsuyama

arXiv:1808.00674·cs.DS·August 3, 2018


TL;DR

This paper presents a randomized algorithm and an average-case algorithm for reconstructing a binary string of length n from substring queries. Both achieve query complexity n+O(1), improving on the previous n+O(log n) bound by removing the additive logarithmic term.

Contribution

It introduces two algorithms that match the information-theoretic lower bound of n up to an additive constant, removing the O(log n) term from prior approaches to string reconstruction.

Findings

01

Randomized algorithm with query complexity n+O(1) with high probability.

02

Average-case algorithm with expected query complexity n+O(1).

03

Both algorithms are optimal up to a constant additive term.

Abstract

The problem called "String reconstruction from substrings" is a mathematical model of sequencing by hybridization that plays an important role in DNA sequencing. In this problem, we are given a blackbox oracle holding an unknown string $X$ and are required to obtain (reconstruct) $X$ through "substring queries" $Q(S)$. $Q(S)$ is given to the oracle with a string $S$ and the answer of the oracle is Yes if $X$ includes $S$ as a substring and No otherwise. Our goal is to minimize the number of queries for the reconstruction. In this paper, we deal with only binary strings for $X$ whose length $n$ is given in advance by using a sequence of good $S$'s. In 1995, Skiena and Sundaram first studied this problem and obtained an algorithm whose query complexity is $n + O(\log n)$. Its information theoretic lower bound is $n$, and they posed an obvious open…
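
To make the query model concrete, here is a minimal sketch of the oracle together with a naive baseline reconstruction. It is not the algorithm from this paper or from Skiena and Sundaram; it is a simple extend-right-then-left strategy that uses on the order of 2n queries, shown only to illustrate what a substring query $Q(S)$ is and why the query count matters. The names `SubstringOracle` and `reconstruct_naive` are hypothetical and introduced for illustration.

```python
class SubstringOracle:
    """Hypothetical blackbox holding a hidden binary string X."""

    def __init__(self, hidden: str) -> None:
        self._hidden = hidden
        self.queries = 0  # number of substring queries asked so far

    def query(self, s: str) -> bool:
        """Q(S): answer Yes (True) iff S occurs in X as a substring."""
        self.queries += 1
        return s in self._hidden


def reconstruct_naive(oracle: SubstringOracle, n: int) -> str:
    """Naive baseline (an assumption for illustration, not the paper's method).

    Invariant: s is always a substring of X.
    """
    s = ""
    # Phase 1: extend to the right until neither s+"0" nor s+"1" occurs.
    # At that point every occurrence of s ends at the end of X,
    # so s is a suffix of X.
    while len(s) < n:
        if oracle.query(s + "0"):
            s += "0"
        elif oracle.query(s + "1"):
            s += "1"
        else:
            break
    # Phase 2: s is a suffix of X, so exactly one character precedes it.
    # A single query per position decides which one.
    while len(s) < n:
        if oracle.query("0" + s):
            s = "0" + s
        else:
            s = "1" + s
    return s


if __name__ == "__main__":
    oracle = SubstringOracle("0110100")
    print(reconstruct_naive(oracle, 7), oracle.queries)
```

The baseline spends up to two queries per character in the first phase, which is where the gap to the lower bound of $n$ comes from; the algorithms summarized above close that gap to an additive constant.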
