# Matching strings in encoded sequences

**Authors:** Adriana Coutinho, Rodrigo Lambert, Jérôme Rousseau

arXiv: 1903.09625 · 2019-12-12

## TL;DR

This paper analyzes the asymptotic behaviour of the longest common substring in encoded sequences, establishing a strong law of large numbers linked to the Rényi entropy of the source, with applications to stochastic models and dynamical systems.

## Contribution

It proves a strong law of large numbers for a re-scaled version of the longest common substring, relates the limit explicitly to the Rényi entropy of the source, and extends the result to dynamical systems and random dynamical systems.

## Key findings

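- Explicit relation between the re-scaled longest common substring and the Rényi entropy of the source
- Application to the zero-inflated contamination model and the stochastic scrabble
- For dynamical systems, equivalence with the shortest distance between two observed orbits, whose limit is related to the correlation dimension of the pushforward measure
- Extension to the shortest distance between orbits for random dynamical systems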

## Abstract

We investigate the longest common substring problem for encoded sequences and its asymptotic behaviour. The main result is a strong law of large numbers for a re-scaled version of this quantity, which presents an explicit relation with the Rényi entropy of the source. We apply this result to the zero-inflated contamination model and the stochastic scrabble. In the case of dynamical systems, this problem is equivalent to the shortest distance between two observed orbits and its limiting relationship with the correlation dimension of the pushforward measure. An extension to the shortest distance between orbits for random dynamical systems is also provided.
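
To make the quantities in the abstract concrete, here is a minimal Python sketch. It is not the authors' code, and the abstract does not state the exact limit. The sketch assumes the classical i.i.d. setting, where the longest common substring length `M_n` of two independent sequences of length `n` satisfies `M_n / log n → 2 / H_2`, with `H_2 = -log Σ p_i²` the Rényi entropy of order 2. The source distribution `probs`, the encoding map `encode`, and all function names are hypothetical choices made for illustration; the entropy is computed for the pushforward (encoded) distribution.

```python
import math
import random
from collections import Counter


def longest_common_substring(x, y):
    """Length of the longest contiguous block that appears in both x and y."""
    best = 0
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0] * (len(y) + 1)
        for j, b in enumerate(y, start=1):
            if a == b:
                # Extend the common block ending at the previous positions.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def pushforward(probs, encode):
    """Distribution of encode(S) when S is drawn from probs."""
    out = Counter()
    for symbol, p in probs.items():
        out[encode(symbol)] += p
    return dict(out)


def renyi2(probs):
    """Renyi entropy of order 2 (natural log) of a discrete distribution."""
    return -math.log(sum(p * p for p in probs.values()))


def rescaled_lcs(n, probs, encode, rng):
    """Draw two independent i.i.d. sequences, encode them, return M_n / log n."""
    symbols = list(probs)
    weights = [probs[s] for s in symbols]
    x = [encode(s) for s in rng.choices(symbols, weights=weights, k=n)]
    y = [encode(s) for s in rng.choices(symbols, weights=weights, k=n)]
    return longest_common_substring(x, y) / math.log(n)


if __name__ == "__main__":
    # Hypothetical source and encoding: merge "b" and "c" into one symbol.
    probs = {"a": 0.5, "b": 0.25, "c": 0.25}
    encode = lambda s: 0 if s == "a" else 1

    rng = random.Random(0)
    observed = rescaled_lcs(2000, probs, encode, rng)
    predicted = 2 / renyi2(pushforward(probs, encode))
    print(f"observed M_n/log n = {observed:.3f}, assumed limit 2/H_2 = {predicted:.3f}")
```

Convergence in this kind of law is logarithmic, so the observed ratio for moderate `n` should only be expected to land in the rough vicinity of the assumed limit.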

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.09625/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/1903.09625/full.md

---
Source: https://tomesphere.com/paper/1903.09625