Reconstructing Sets of Strings from Their k-way Projections: Algorithms & Complexity
Elise Tate, Joshua A. Grochow

TL;DR
This paper introduces the String Set Reconstruction problem, exploring its computational complexity and proposing a new algorithm based on overlap graphs to reconstruct string sets from k-way projections, with experimental validation.
Contribution
The paper presents the first algorithm for reconstructing string sets from k-way projections. The algorithm uses modified overlap graphs and handles non-contiguous k-mers, and the paper also analyzes the problem's complexity.
Findings
The problem is computationally hard, and the paper also gives inapproximability results.
Experimental results indicate that the proposed algorithm is efficient and scalable.
Analytic approximations explain the observed complexity scaling.
Abstract
Graphs are a powerful tool for analyzing large data sets, but many real-world phenomena involve interactions that go beyond the simple pairwise relationships captured by a graph. In this paper we introduce and study a simple combinatorial model to capture higher order dependencies from an algorithms and computational complexity perspective. Specifically, we introduce the String Set Reconstruction problem, which asks when a set of strings can be reconstructed from seeing only the k-way projections of strings in the set. This problem is distinguished from genetic reconstruction problems in that we allow projections from any k indices and we maintain knowledge of those indices, but not which k-mer came from which string. We give several results on the complexity of this problem, including hardness results, inapproximability, and parametrized complexity. Our main result is the…
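To make the problem statement concrete, here is a minimal Python sketch of what a k-way projection looks like, assuming all strings have the same length. The function names (k_way_projections, consistent_strings) and the brute-force consistency check are illustrative assumptions, not the paper's algorithm. For each choice of k indices the index tuple is kept, but the link between a k-mer and the string it came from is discarded.

```python
from itertools import combinations, product


def k_way_projections(strings, k):
    """Map each k-tuple of indices to the set of k-mers seen at those indices.

    Hypothetical helper: assumes a non-empty collection of equal-length strings.
    """
    strings = list(strings)
    n = len(strings[0])
    if any(len(s) != n for s in strings):
        raise ValueError("all strings must have the same length")
    return {
        idx: {tuple(s[i] for i in idx) for s in strings}
        for idx in combinations(range(n), k)
    }


def consistent_strings(projections, alphabet, n):
    """Brute force: every length-n string whose k-mers all appear in the projections.

    Exponential in n; for illustration only.
    """
    result = set()
    for cand in product(alphabet, repeat=n):
        if all(
            tuple(cand[i] for i in idx) in seen
            for idx, seen in projections.items()
        ):
            result.add("".join(cand))
    return result


# Even-parity strings of length 3: every pair of positions shows all four 2-mers,
# so the 2-way projections cannot distinguish this set from all 8 binary strings.
parity = {"000", "011", "101", "110"}
parity_consistent = consistent_strings(k_way_projections(parity, 2), "01", 3)

# Here the 2-way projections force all three positions to agree,
# so only the original two strings are consistent.
constant = {"000", "111"}
constant_consistent = consistent_strings(k_way_projections(constant, 2), "01", 3)
```

In this sketch the first set is not determined by its 2-way projections, while the second is. This is the kind of distinction the reconstruction question is about, though the paper's formal definition may differ in detail.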
Taxonomy
Topics: Algorithms and Data Compression · Graph Theory and Algorithms · Advanced Graph Theory Research
