
TL;DR
This paper introduces an online variant of string attractors, shows that Lempel-Ziv is optimal for this problem with a competitive ratio of O(log n), and establishes lower bounds for specific infinite words such as the Fibonacci word.
Contribution
It defines the first online version of string attractors and analyzes its competitive ratio, connecting it to Lempel-Ziv and studying bounds for special infinite words.
Findings
Lempel-Ziv factorization is the optimal online algorithm with O(log n) competitive ratio.
For certain infinite words, such as the Fibonacci word, the online attractor cost has a lower bound of Ω(log n).
The online k-attractor problem is shown to be strictly k-competitive.
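To make the link between Lempel-Ziv and attractors concrete, here is a minimal sketch of a greedy LZ77-style factorization that also reports the last position of each phrase. It rests on assumptions: this summary does not say which Lempel-Ziv variant the paper analyzes, so the variant here (longest earlier-starting match, or a single fresh character) and the names `lz77_phrases` and `phrase_end_positions` are illustrative only. For this variant, the phrase end positions are known to form a string attractor.

```python
def lz77_phrases(text):
    """Greedy LZ77-style factorization (illustrative variant, not the paper's).

    Each phrase is the longest prefix of the remaining text that also
    occurs starting at an earlier position (overlap allowed), or a single
    character if no such match exists. Quadratic time; for small inputs.
    """
    phrases = []
    i, n = 0, len(text)
    while i < n:
        best = 0
        for src in range(i):  # try every earlier starting position
            length = 0
            while i + length < n and text[src + length] == text[i + length]:
                length += 1
            best = max(best, length)
        step = max(best, 1)  # a fresh character forms its own phrase
        phrases.append(text[i:i + step])
        i += step
    return phrases


def phrase_end_positions(phrases):
    """Return the 0-based index of the last character of each phrase."""
    ends, pos = [], -1
    for phrase in phrases:
        pos += len(phrase)
        ends.append(pos)
    return ends


phrases = lz77_phrases("abababab")
print(phrases)                        # ['a', 'b', 'ababab']
print(phrase_end_positions(phrases))  # [0, 1, 7]
```

The factorization reads the text left to right and never revises an earlier phrase, which is why Lempel-Ziv style parsing is a natural candidate in an online setting.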
Abstract
In today's data-centric world, fast and effective compression of data is paramount. To measure success towards the second goal, Kempa and Prezza [STOC 2018] introduce the string attractor, a combinatorial object unifying dictionary-based compression. Given a string, a string attractor (k-attractor) is a set of positions such that every distinct substring (of length at most k) has at least one occurrence that contains one of the selected positions. String attractors are shown to be approximated by and thus measure the quality of many important dictionary compression algorithms such as Lempel-Ziv 77, the Burrows-Wheeler transform, straight line programs, and macro schemes. In order to handle massive amounts of data, compression often has to be achieved in a streaming fashion. Thus, practically applied compression algorithms, such as…
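The attractor definition in the abstract can be checked directly by brute force. The sketch below is only an illustration of the definition, not an algorithm from the paper; the function name `is_k_attractor`, the 0-based position convention, and the choice to treat a missing `k` as "all substring lengths" are assumptions made for this example.

```python
def is_k_attractor(text, positions, k=None):
    """Check whether `positions` (0-based) form a k-attractor of `text`.

    Every distinct substring of length at most k must have at least one
    occurrence that contains a selected position. With k=None, all
    substring lengths are checked (a plain string attractor).
    Brute force; intended for small inputs only.
    """
    n = len(text)
    if k is None:
        k = n
    chosen = set(positions)
    for length in range(1, min(k, n) + 1):
        seen = set()     # all distinct substrings of this length
        covered = set()  # those with an occurrence hitting a chosen position
        for start in range(n - length + 1):
            sub = text[start:start + length]
            seen.add(sub)
            if any(p in chosen for p in range(start, start + length)):
                covered.add(sub)
        if seen != covered:
            return False
    return True


print(is_k_attractor("abba", {0, 1}, k=1))  # True: 'a' and 'b' are both hit
print(is_k_attractor("abba", {0, 1}, k=2))  # False: 'ba' occurs only at 2..3
print(is_k_attractor("abab", {0, 1}))       # True
```

The `abba` case shows why the length bound matters: the same position set can be a valid 1-attractor and still fail once longer substrings must be covered.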
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing · IoT-based Smart Home Systems
