Restricted Common Superstring and Restricted Common Supersequence
Raphaël Clifford, Zvi Gotthilf, Moshe Lewenstein, Alexandru Popa

TL;DR
This paper studies resource-constrained versions of the shortest common superstring and supersequence problems, establishing NP-hardness, approximation bounds, and algorithms for various constrained cases.
Contribution
It introduces the RCSstr and RCSseq problems, proves their NP-hardness in general and special cases, and provides approximation algorithms and bounds.
Findings
RCSstr is NP-complete and hard to approximate within a factor of n^{1-ε} for any ε > 0, unless P = NP
RCSstr remains NP-hard when the multiset is drawn from a binary alphabet, and also when all input strings have length two
Approximation algorithms are developed for various RCSstr and RCSseq variants
Abstract
The {\em shortest common superstring} and the {\em shortest common supersequence} are two well-studied problems having a wide range of applications. In this paper we consider both problems with resource constraints, denoted as the Restricted Common Superstring (shortly \textit{RCSstr}) problem and the Restricted Common Supersequence (shortly \textit{RCSseq}) problem. In the \textit{RCSstr} (\textit{RCSseq}) problem we are given a set of $n$ strings, $s_1$, $s_2$, $\ldots$, $s_n$, and a multiset $t = \{t_1, t_2, \dots, t_m\}$, and the goal is to find a permutation $\pi : \{1, \dots, m\} \to \{1, \dots, m\}$ to maximize the number of strings in $s_1, s_2, \ldots, s_n$ that are substrings (subsequences) of $\pi(t) = t_{\pi(1)} t_{\pi(2)} \cdots t_{\pi(m)}$ (we call this ordering of the multiset, $\pi(t)$, a permutation of $t$). We first show that in its most general setting the \textit{RCSstr} problem is {\em…
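To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the paper's algorithm: it simply tries every distinct ordering of the multiset and counts how many input strings are contained in it, which takes exponential time and is only usable on tiny instances. The function names (`is_substring`, `is_subsequence`, `best_ordering`) and the example instance are illustrative assumptions, not taken from the paper.

```python
from itertools import permutations


def is_substring(s, text):
    """True if s occurs contiguously in text (the RCSstr containment test)."""
    return s in text


def is_subsequence(s, text):
    """True if s can be obtained from text by deleting characters (RCSseq test)."""
    remaining = iter(text)
    # `ch in remaining` consumes the iterator up to the first match,
    # so characters of s must appear in text in the same order.
    return all(ch in remaining for ch in s)


def best_ordering(strings, multiset, contains):
    """Exhaustively search all distinct orderings of `multiset`.

    Returns (count, ordering), where count is the maximum number of input
    strings contained in a single ordering. Hypothetical helper for
    illustration only; it runs in exponential time.
    """
    best_count, best_text = -1, ""
    # sorted(set(...)) removes duplicate orderings caused by repeated
    # characters and makes tie-breaking deterministic.
    for perm in sorted(set(permutations(multiset))):
        text = "".join(perm)
        count = sum(1 for s in strings if contains(s, text))
        if count > best_count:
            best_count, best_text = count, text
    return best_count, best_text


strings = ["ab", "ba", "aa"]
multiset = "aab"  # the multiset {a, a, b}

print(best_ordering(strings, multiset, is_substring))
print(best_ordering(strings, multiset, is_subsequence))
```

On this small instance no ordering of {a, a, b} contains all three strings as substrings, while the ordering "aba" contains all three as subsequences, which illustrates how the two variants can differ on the same input.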
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Genomic variations and chromosomal abnormalities
