On the Complexity of BWT-runs Minimization via Alphabet Reordering
Jason Bentley, Daniel Gibney, Sharma V. Thankachan

TL;DR
This paper investigates the computational complexity of minimizing BWT-runs through alphabet reordering, proving NP-completeness, APX-hardness, and providing approximation algorithms and special case solutions.
Contribution
It establishes the NP-completeness and APX-hardness of BWT-runs minimization via alphabet reordering and offers an optimal algorithm for specific cases.
Findings
Decision problem is NP-complete.
Optimization problem is APX-hard.
An $O\left(\frac{n}{\text{some factor}}\right)$ approximation exists.
Abstract
The Burrows-Wheeler Transform (BWT) has been an essential tool in text compression and indexing. First introduced in 1994, it went on to provide the backbone for the first encoding of the classic suffix tree data structure in space close to the entropy-based lower bound. Recently, there has been the development of compact suffix trees in space proportional to $r$, the number of runs in the BWT, as well as the appearance of $r$ in the time complexity of new algorithms. Unlike other popular measures of compression, the parameter $r$ is sensitive to the lexicographic ordering given to the text's alphabet. Despite several past attempts to exploit this, a provably efficient algorithm for finding, or approximating, an alphabet ordering which minimizes $r$ has been open for years. We present the first set of results on the computational complexity of minimizing BWT-runs via alphabet…
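To make the sensitivity of $r$ to the alphabet ordering concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithm. It builds the BWT by naively sorting rotations, assumes a sentinel `$` that does not occur in the text and is always the smallest symbol, and searches every ordering by brute force, which is only feasible for very small alphabets. The function names `bwt`, `count_runs`, and `best_order` are hypothetical.

```python
from itertools import groupby, permutations


def bwt(text, order):
    """Return the BWT of text + '$' under the alphabet ordering `order`.

    `order` is a string listing the symbols of `text` from smallest to largest.
    Assumption: '$' does not occur in `text` and always ranks lowest.
    """
    rank = {c: i + 1 for i, c in enumerate(order)}
    rank["$"] = 0
    s = text + "$"
    rotations = [s[i:] + s[:i] for i in range(len(s))]
    # Sort rotations lexicographically with respect to the chosen ordering.
    rotations.sort(key=lambda rot: [rank[c] for c in rot])
    return "".join(rot[-1] for rot in rotations)


def count_runs(s):
    """Number of maximal runs of equal symbols in s (the parameter r)."""
    return sum(1 for _ in groupby(s))


def best_order(text):
    """Brute-force search over all orderings; exponential in the alphabet size."""
    alphabet = sorted(set(text))
    candidates = ("".join(p) for p in permutations(alphabet))
    return min(candidates, key=lambda o: count_runs(bwt(text, o)))


if __name__ == "__main__":
    text = "banana"
    print(bwt(text, "abn"))  # annb$aa
    for p in permutations(sorted(set(text))):
        order = "".join(p)
        print(order, bwt(text, order), count_runs(bwt(text, order)))
    print("best ordering found:", best_order(text))
```

The exhaustive search tries every permutation of the alphabet, so its cost grows factorially with the number of distinct symbols. It is useful only for checking small examples by hand.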
