Generalization of Repetitiveness Measures for Two-Dimensional Strings
Lorenzo Carfagna, Giovanni Manzini, Giuseppe Romana, Marinella Sciortino, Cristian Urbina

TL;DR
This paper extends measures of string repetitiveness from one-dimensional to multi-dimensional strings, analyzing their properties, relationships, and implications for data compression and access efficiency.
Contribution
It introduces and compares new complexity measures for two-dimensional strings, exploring their relationships and providing a grammar-based representation with efficient symbol access.
Findings
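Substring-based and copy-paste-based measures become incomparable once the dimension is ≥ 2
The grammar-based representation supports access to any symbol in O(log N) time, where N is the size of the input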
Insights for designing effective 2D compressors
Abstract
The problem of detecting and measuring the repetitiveness of one-dimensional strings has been extensively studied in data compression and text indexing. Our understanding of these issues has been significantly improved by the introduction of the notion of string attractor [Kempa and Prezza, STOC 2018] and by the results showing the relationship between attractors and other measures of compressibility. When the input data are structured in a non-linear way, as in two-dimensional strings, inherent redundancy often offers an even richer source for compression. However, systematic studies on repetitiveness measures for two-dimensional strings are still scarce. In this paper we extend to two or more dimensions the main measures of complexity introduced for one-dimensional strings. We distinguish between the measures δ and γ, defined in terms of the substrings of the input,…
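To make the idea of a measure "defined in terms of the substrings of the input" concrete, here is a minimal brute-force sketch of one natural two-dimensional analogue of the substring complexity (usually written δ in the one-dimensional literature). It assumes the following definition, which is an illustration and may differ in detail from the paper's: count the distinct k1 × k2 submatrices for every shape, divide by k1·k2, and take the maximum. The function names `distinct_blocks` and `substring_complexity_2d` are hypothetical.

```python
from fractions import Fraction
from typing import Sequence


def distinct_blocks(grid: Sequence[str], k1: int, k2: int) -> int:
    """Count the distinct k1 x k2 submatrices of a rectangular grid of characters."""
    m, n = len(grid), len(grid[0])
    seen = set()
    for i in range(m - k1 + 1):
        for j in range(n - k2 + 1):
            # A block is identified by the tuple of its k1 row slices.
            seen.add(tuple(grid[i + r][j:j + k2] for r in range(k1)))
    return len(seen)


def substring_complexity_2d(grid: Sequence[str]) -> Fraction:
    """Assumed definition: max over all shapes (k1, k2) of #distinct blocks / (k1 * k2)."""
    m, n = len(grid), len(grid[0])
    return max(
        Fraction(distinct_blocks(grid, k1, k2), k1 * k2)
        for k1 in range(1, m + 1)
        for k2 in range(1, n + 1)
    )


if __name__ == "__main__":
    print(substring_complexity_2d(["aaa", "aaa", "aaa"]))  # 1
    print(substring_complexity_2d(["ab", "ba"]))           # 2
```

The enumeration is exhaustive and only meant for tiny inputs; it illustrates the definition rather than an efficient algorithm.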
Taxonomy
Topics: Algorithms and Data Compression · Semigroups and Automata Theory · Computability, Logic, AI Algorithms
