A New Algebraic Approach for String Reconstruction from Substring Compositions
Utkarsh Gupta, Hessam Mahdavifar

TL;DR
This paper introduces an algebraic algorithm for reconstructing a binary string from its substring composition multiset. When no backtracking is needed, the algorithm runs in $O(n^2)$ time, and it yields reconstruction codebooks that are larger by a linear factor than previously known constructions.
Contribution
The paper presents a novel algebraic approach, characterizes algebraic conditions on the binary string under which reconstruction needs no backtracking and therefore runs in polynomial time, and extends the class of binary strings that are uniquely reconstructable.
Findings
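In the no-backtracking case, the algorithm runs in $O(n^2)$ time, where $n$ is the length of the string.
Larger sets of binary strings are uniquely reconstructable without backtracking.
The resulting reconstruction codebooks are larger by a linear factor than those of previous constructions.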
Abstract
We consider the problem of binary string reconstruction from the multiset of its substring compositions, referred to as the substring composition multiset, first introduced and studied by Acharya et al. We introduce a new algorithm for string reconstruction from the substring composition multiset, which relies on the algebraic properties of the equivalent bivariate polynomial formulation of the problem. We then characterize specific algebraic conditions on the binary string to be reconstructed that guarantee the algorithm does not require any backtracking during reconstruction and, consequently, that the time complexity is bounded polynomially. More specifically, in the case of no backtracking, our algorithm has a time complexity of $O(n^2)$ compared to the algorithm by Acharya et al., which has a time complexity of $O(n^2 \log n)$, where $n$ is the length of the…
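To make the input to the reconstruction problem concrete, here is a minimal Python sketch that computes the substring composition multiset of a binary string. It is an illustration only, not the paper's algorithm. The function name `composition_multiset` and the choice to represent each composition as a `(zeros, ones)` pair are assumptions made for this example.

```python
from collections import Counter


def composition_multiset(s: str) -> Counter:
    """Multiset of (zeros, ones) counts over all contiguous substrings of s.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if set(s) - {"0", "1"}:
        raise ValueError("expected a binary string")
    n = len(s)
    # prefix_ones[i] = number of '1' characters in s[:i]
    prefix_ones = [0]
    for ch in s:
        prefix_ones.append(prefix_ones[-1] + (ch == "1"))
    multiset = Counter()
    # Enumerate all n(n+1)/2 substrings s[i:j]; only the counts are kept,
    # not the order of symbols inside each substring.
    for i in range(n):
        for j in range(i + 1, n + 1):
            ones = prefix_ones[j] - prefix_ones[i]
            zeros = (j - i) - ones
            multiset[(zeros, ones)] += 1
    return multiset


if __name__ == "__main__":
    print(composition_multiset("101"))
    # A string and its reversal always yield the same multiset.
    print(composition_multiset("1101") == composition_multiset("1011"))
```

Because a composition records only how many 0s and 1s a substring contains, a string and its reversal produce the same multiset, so reconstruction can be unique at best up to reversal.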
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Cellular Automata and Applications
