Improved Approximation Guarantees for Shortest Superstrings using Cycle Classification by Overlap to Length Ratios
Matthias Englert, Nicolaos Matsakis, Pavel Veselý

TL;DR
This paper improves approximation guarantees for the Shortest Superstring problem: it lowers the upper bound on the GREEDY algorithm's approximation ratio from 3.5 to approximately 3.425, and it shows that the problem can be approximated within a factor of approximately 2.475.
Contribution
It presents the first improvement on the GREEDY algorithm's approximation ratio since 2005 and offers a new, tighter approximation factor for the Shortest Superstring problem.
Findings
GREEDY algorithm's approximation ratio is at most approximately 3.425.
Shortest Superstring problem can be approximated within a factor of approximately 2.475.
Provides the first progress on GREEDY's approximation guarantee since 2005.
Abstract
In the Shortest Superstring problem, we are given a set of strings and we are asking for a common superstring, which has the minimum number of characters. The Shortest Superstring problem is NP-hard and several constant-factor approximation algorithms are known for it. Of particular interest is the GREEDY algorithm, which repeatedly merges two strings of maximum overlap until a single string remains. The GREEDY algorithm, being simpler than other well-performing approximation algorithms for this problem, has attracted attention since the 1980s and is commonly used in practical applications. Tarhio and Ukkonen (TCS 1988) conjectured that GREEDY gives a 2-approximation. In a seminal work, Blum, Jiang, Li, Tromp, and Yannakakis (STOC 1991) proved that the superstring computed by GREEDY is a 4-approximation, and this upper bound was improved to 3.5 by Kaplan and Shafrir (IPL 2005). We…
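The GREEDY procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `overlap` and `greedy_superstring` are hypothetical, ties between equally large overlaps are broken by scan order (an assumption, since the abstract does not specify a rule), and strings contained in other input strings are dropped up front, as is commonly assumed for this problem. The quadratic pair scan favors clarity over speed.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings: list[str]) -> str:
    """Repeatedly merge the ordered pair with maximum overlap (illustrative sketch)."""
    # Remove duplicates, then drop strings that occur inside another string.
    items = list(dict.fromkeys(strings))
    items = [s for s in items if not any(s != t and s in t for t in items)]

    while len(items) > 1:
        best_k, best_i, best_j = -1, 0, 1
        # Scan all ordered pairs; the first pair with the largest overlap wins.
        for i, a in enumerate(items):
            for j, b in enumerate(items):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Merge: append the part of the second string not covered by the overlap.
        merged = items[best_i] + items[best_j][best_k:]
        items = [s for idx, s in enumerate(items) if idx not in (best_i, best_j)]
        items.append(merged)

    return items[0] if items else ""


print(greedy_superstring(["abc", "bcd", "cde"]))  # prints "abcde"
```

The result is always a common superstring of the input, but it is not necessarily a shortest one. For example, on the strings "cabab", "baba", and "ababc", this sketch first merges "cabab" and "ababc" (overlap 4) and returns a string of length 10, whereas "cabababc" of length 8 also contains all three.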
Taxonomy
Topics: Algorithms and Data Compression · Natural Language Processing Techniques · Machine Learning and Algorithms
