Novel Results on the Number of Runs of the Burrows-Wheeler-Transform
Sara Giuliani, Shunsuke Inenaga, Zsuzsanna Lipták, Nicola Prezza, Marinella Sciortino, Anna Toffanello

TL;DR
This paper investigates the ratio of BWT run counts between a string and its reverse, providing new bounds and insights into the measure of string repetitiveness relevant for data compression and indexing.
Contribution
It establishes the first non-trivial lower bounds on the ratio of BWT runs between a string and its reverse, challenging the adequacy of run count as a measure of repetitiveness.
Findings
Infinite families of strings with ratio Θ(log n)
Upper bound of O(log^2 n) from prior work
Implication that run count may not fully capture string repetitiveness
Abstract
The Burrows-Wheeler-Transform (BWT), a reversible string transformation, is one of the fundamental components of many current data structures in string processing. It is central in data compression, as well as in efficient query algorithms for sequence data, such as webpages, genomic and other biological sequences, or indeed any textual data. The BWT lends itself well to compression because its number of equal-letter-runs (usually referred to as r) is often considerably lower than that of the original string; in particular, it is well suited for strings with many repeated factors. In fact, much attention has been paid to the parameter r as a measure of repetitiveness, especially to evaluate the performance in terms of both space and time of compressed indexing data structures. In this paper, we investigate ρ(v), the ratio of r and of the number of runs of the BWT of the…
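To make the quantities in the abstract concrete, here is a minimal Python sketch that counts the equal-letter runs of the BWT of a string and of its reverse, then compares the two counts. It rests on assumptions not stated on this page: the BWT is taken as the last column of the sorted rotations with no end-of-string marker, and the ratio is taken symmetrically as the larger count divided by the smaller. The function names `bwt`, `runs`, and `run_ratio` are illustrative only, and the quadratic rotation sort is meant for small experiments, not for real indexing workloads.

```python
def bwt(s: str) -> str:
    """Last column of the lexicographically sorted rotations of s.

    Assumption: no end-of-string sentinel is appended.
    """
    n = len(s)
    doubled = s + s  # every rotation of s is a length-n window of s + s
    rotations = sorted(doubled[i:i + n] for i in range(n))
    return "".join(rot[-1] for rot in rotations)


def runs(s: str) -> int:
    """Number of maximal equal-letter runs in s (0 for the empty string)."""
    if not s:
        return 0
    # A new run starts at every position whose letter differs from the previous one.
    return 1 + sum(1 for a, b in zip(s, s[1:]) if a != b)


def run_ratio(v: str) -> float:
    """Ratio between the BWT run counts of v and of its reverse.

    Assumption: the ratio is symmetric (larger count over smaller count),
    so the result is always at least 1. Requires a non-empty string.
    """
    r = runs(bwt(v))
    r_rev = runs(bwt(v[::-1]))
    return max(r / r_rev, r_rev / r)


if __name__ == "__main__":
    for word in ["banana", "abracadabra", "aabab"]:
        print(word, bwt(word), runs(bwt(word)), run_ratio(word))
```

Under this rotation-based definition, a string whose reverse is one of its own rotations (for example "banana", whose reverse "ananab" is a rotation) has the same BWT in both directions, so its ratio is exactly 1.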
