Bidirectional Text Compression in External Memory
Patrick Dinklage, Jonas Ellert, Johannes Fischer, Dominik Köppl, Manuel Penschuck

TL;DR
This paper introduces a bidirectional text compression algorithm suited to external memory. On data sets of up to 128 GiB, it runs significantly faster than known LZ77 compressors while producing a roughly similar number of factors.
Contribution
The paper presents a bidirectional compression algorithm designed for an external-memory implementation, along with an external-memory decompressor for texts compressed with any unidirectional or bidirectional scheme.
Findings
Compresses significantly faster than all known LZ77 compressors on large data sets.
Produces a roughly similar number of factors to LZ77 methods.
Handles inputs of up to 128 GiB using only 16 GiB of RAM.
Abstract
Bidirectional compression algorithms work by substituting repeated substrings with references that, unlike in the famous LZ77 scheme, can point in either direction. We present such an algorithm that is particularly suited for an external memory implementation. We evaluate it experimentally on large data sets of size up to 128 GiB (using only 16 GiB of RAM) and show that it is significantly faster than all known LZ77 compressors, while producing a roughly similar number of factors. We also introduce an external memory decompressor for texts compressed with any uni- or bidirectional compression scheme.
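To make the idea of references that point in either direction concrete, here is a minimal in-memory decoding sketch. It is not the paper's external-memory decompressor. The factor representation (a literal string, or a pair of source position and length) and the function name `decode_bidirectional` are assumptions chosen for illustration only.

```python
from typing import List, Optional, Tuple, Union

# Hypothetical factor format (an assumption, not the paper's encoding):
# a literal string, or (source_position, length) referring to text positions.
Factor = Union[str, Tuple[int, int]]


def decode_bidirectional(factors: List[Factor]) -> str:
    """Decode a parse whose references may point left or right in the text."""
    chars: List[Optional[str]] = []  # resolved character per text position
    source: List[int] = []           # position copied from, or -1 for a literal

    # Pass 1: lay out every text position as a literal or a pointer.
    for factor in factors:
        if isinstance(factor, str):
            for ch in factor:
                chars.append(ch)
                source.append(-1)
        else:
            src, length = factor
            for k in range(length):
                chars.append(None)
                source.append(src + k)

    n = len(chars)
    for i in range(n):
        if chars[i] is None and not 0 <= source[i] < n:
            raise ValueError(f"position {i} refers outside the text")

    # Pass 2: follow each pointer chain until it reaches a known character.
    for i in range(n):
        chain = []
        j = i
        while chars[j] is None:
            chain.append(j)
            if len(chain) > n:
                # More unresolved steps than positions means a cycle.
                raise ValueError("cyclic references cannot be decoded")
            j = source[j]
        for p in chain:
            chars[p] = chars[j]

    return "".join(chars)


# The first factor copies from positions 3..5, which lie to its right.
print(decode_bidirectional([(3, 3), "abc"]))
# A reference may also overlap the text it produces.
print(decode_bidirectional(["a", (0, 4)]))
```

An LZ77 decoder can resolve each reference immediately because its source always lies to the left. With references in both directions, a position may depend on text that has not been decoded yet, so this sketch follows pointer chains and rejects parses that contain cycles. The random accesses in the second pass are what make a naive approach unsuitable once the text no longer fits in RAM.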
