On Undetected Redundancy in the Burrows-Wheeler Transform
Uwe Baier

TL;DR
This paper introduces a combinatorial technique that reduces the size of the Burrows-Wheeler Transform (BWT) while keeping it invertible, cutting the encoding size of BWT-based compressors by 8-16% on average and making them competitive with today's best lossless compressors.
Contribution
It presents a new method that uses the combinatorial properties of the BWT to reduce its size while keeping it invertible. The method can be applied to any BWT-based compressor.
Findings
Encoding size reduced by 8-16% on average, depending on the BWT compressor used
Encoding size reduced by up to 33-57% in the best cases
BWT-based compressors become competitive with, or even superior to, today's best lossless compressors
Abstract
The Burrows-Wheeler Transform (BWT) is an invertible permutation of a text that is known to be highly compressible and is also useful for sequence analysis, which makes it highly attractive for lossless data compression. In this paper, we present a new technique that reduces the size of a BWT using its combinatorial properties while keeping it invertible. The technique can be applied to any BWT-based compressor and, as experiments show, reduces the encoding size by 8-16% on average and by up to 33-57% in the best cases (depending on the BWT compressor used), making BWT-based compressors competitive with, or even superior to, today's best lossless compressors.
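To make "invertible permutation of a text" concrete, here is a minimal sketch of the standard, textbook BWT and its inverse. It is not the size-reduction technique from the paper, which the abstract does not describe in detail. The function names `bwt` and `inverse_bwt` and the `$` sentinel are assumptions chosen for illustration, and the quadratic-time construction is only meant for short strings.

```python
def bwt(text, sentinel="$"):
    """Naive BWT: sort all rotations of text + sentinel, keep the last column.

    The sentinel is an illustrative assumption; it must not occur in the text.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Invert the BWT by repeatedly prepending the last column and re-sorting.

    After len(last_column) rounds the table holds all sorted rotations;
    the row ending in the sentinel is the original text plus the sentinel.
    """
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


transformed = bwt("banana")
print(transformed)               # equal characters tend to end up next to each other
print(inverse_bwt(transformed))  # recovers the input
```

The transform only reorders the characters of the input, which is why it can be undone, and the grouping of equal characters in the output is what later compression stages exploit.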
