L-systems for Measuring Repetitiveness*

Gonzalo Navarro (1, 2), Cristian Urbina (1, 2) ((1) University of Chile, (2) CeBiB)

arXiv:2206.01688·cs.DS·June 6, 2022

Open Access

TL;DR

This paper investigates a measure of string repetitiveness based on L-systems, compares it with existing bounds, and introduces NU-systems that can achieve even greater compression, advancing understanding of sequence self-similarity.

Contribution

It deepens the analysis of the L-system-based repetitiveness measure $ℓ$, compares it with the substring-complexity lower bound $δ$, and studies NU-systems, which can be asymptotically smaller than both L-systems and macro-schemes.

Findings

01

$ℓ$ and $δ$ are largely orthogonal measures.

02

NU-systems can be asymptotically smaller than L-systems and macro-schemes.

03

The size $ν$ of NU-systems is the smallest known measure of repetitiveness.

Abstract

An L-system (for lossless compression) is a CPD0L-system extended with two parameters $d$ and $n$, which determines unambiguously a string $w = τ(φ^{d}(s))[1 : n]$, where $φ$ is the morphism of the system, $s$ is its axiom, and $τ$ is its coding. The length of the shortest description of an L-system generating $w$ is known as $ℓ$, and is arguably a relevant measure of repetitiveness that builds on the self-similarities that arise in the sequence. In this paper we deepen the study of the measure $ℓ$ and its relation with $δ$, a better established lower bound that builds on substring complexity. Our results show that $ℓ$ and $δ$ are largely orthogonal, in the sense that one can be much larger than the other depending on the case. This suggests that both sources of repetitiveness are mostly unrelated. We also show that the recently introduced NU-systems,…
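To make the definition $w = τ(φ^{d}(s))[1 : n]$ concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `expand_l_system` and `delta`, the dictionary encoding of $φ$ and $τ$, and the Fibonacci morphism used as the example are all choices made here. The `delta` helper uses the usual definition of the substring-complexity measure, $δ = \max_k d_k(w)/k$ with $d_k(w)$ the number of distinct length-$k$ substrings of $w$, which this abstract refers to but does not spell out.

```python
def expand_l_system(rules, coding, axiom, d, n):
    """Return tau(phi^d(axiom))[1:n] (1-based, inclusive) as a string.

    rules  -- dict mapping each symbol to its image under the morphism phi
    coding -- dict mapping each symbol to a single output symbol (tau)
    All names and the dict representation are assumptions of this sketch.
    """
    # "Propagating" (the P in CPD0L) means phi never erases a symbol.
    if any(len(image) == 0 for image in rules.values()):
        raise ValueError("phi must be non-erasing")
    word = axiom
    for _ in range(d):
        # Because phi is non-erasing, the first n symbols of phi(word)
        # depend only on the first n symbols of word, so truncating
        # after every step does not change the final prefix.
        word = "".join(rules[c] for c in word)[:n]
    if len(word) < n:
        raise ValueError("phi^d(axiom) is shorter than n")
    return "".join(coding[c] for c in word)


def delta(w):
    """Brute-force max over k of (distinct substrings of length k) / k."""
    n = len(w)
    return max(
        len({w[i:i + k] for i in range(n - k + 1)}) / k
        for k in range(1, n + 1)
    )


# Example (an assumption of this sketch): the Fibonacci morphism
# a -> ab, b -> a, with the identity coding.
fib_rules = {"a": "ab", "b": "a"}
identity = {"a": "a", "b": "b"}
prefix = expand_l_system(fib_rules, identity, "a", d=5, n=10)
print(prefix)         # abaababaab
print(delta(prefix))  # 2.0
```

The description of this system (two rules, a coding, an axiom, and the integers $d$ and $n$) has constant size apart from the $O(\log n)$ bits for the integers, while the generated prefix can be arbitrarily long, which is the sense in which such a system compresses a self-similar string. The `delta` helper is quadratic or worse and is meant only for checking small examples.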

Taxonomy

Topics: Algorithms and Data Compression · Semigroups and Automata Theory · Cellular Automata and Applications