Multi-strand Reconstruction from Substrings
Yonatan Yehezkeally, Sagi Marcovich, Eitan Yaakobi

TL;DR
This paper studies the problem of reconstructing multiple strings simultaneously from the pooled collection of their substrings of a fixed length, providing bounds and code constructions under which every stored multiset of strings can be reconstructed.
Contribution
It introduces the concept of multi-strand $\ell$-reconstruction codes, establishes a lower bound on the substring length $\ell$ needed for reconstruction, and presents two near-optimal code constructions.
Findings
Lower bounds on the substring length $\ell$ for successful reconstruction
Two code constructions with rates approaching 1
Asymptotic analysis of code rates and bounds
Abstract
The problem of string reconstruction based on its substrings spectrum has received significant attention recently due to its applicability to DNA data storage and sequencing. In contrast to previous works, we consider in this paper a setup of this problem where multiple strings are reconstructed together. Given a multiset $S$ of strings, all their substrings of some fixed length $\ell$, defined as the $\ell$-profile of $S$, are received and the goal is to reconstruct all strings in $S$. A multi-strand $\ell$-reconstruction code is a set of multisets such that every element $S$ can be reconstructed from its $\ell$-profile. Given the number of strings $k$ and their length $n$, we first find a lower bound on the value of $\ell$ necessary for existence of multi-strand $\ell$-reconstruction codes with non-vanishing asymptotic rate. We then present two constructions of such codes and show…
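To make the profile definition in the abstract concrete, here is a minimal Python sketch. It is an illustration, not the paper's construction: the function name `l_profile` and the parameter `ell` are hypothetical, and it assumes the profile is a multiset, so repeated substrings are counted with multiplicity. The toy example shows two different multisets of strings that share the same profile when the substring length is too small, which is why a lower bound on that length is needed.

```python
from collections import Counter


def l_profile(strands, ell):
    """Return the multiset of all length-ell substrings of all strands.

    Assumption: the profile keeps multiplicities, so it is modeled as a
    Counter mapping each substring to its number of occurrences.
    """
    profile = Counter()
    for s in strands:
        # Slide a window of width ell over the strand.
        for i in range(len(s) - ell + 1):
            profile[s[i:i + ell]] += 1
    return profile


# Two distinct multisets of k = 2 strings, each of length n = 2.
s1 = ["ab", "ba"]
s2 = ["aa", "bb"]

# With ell = 1 the profiles coincide, so s1 and s2 cannot be told apart.
print(l_profile(s1, 1) == l_profile(s2, 1))  # True

# With ell = 2 the profiles differ, so reconstruction becomes possible.
print(l_profile(s1, 2) == l_profile(s2, 2))  # False
```

In this toy case a code containing both `s1` and `s2` would fail for substring length 1 but not for length 2; the sketch does not attempt the reconstruction step itself.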
