Longest common substring for random subshifts of finite type

Jerome Rousseau

arXiv:1905.08131·math.DS·November 24, 2020

TL;DR

This paper investigates the behavior of the longest common substring in random subshifts of finite type and random sequences, linking it to Rényi entropy under exponential mixing conditions, with a focus on quenched results.

Contribution

It establishes a connection between the longest common substring behavior and Rényi entropy in random subshifts, providing quenched results under mixing assumptions.

Findings

01 Behavior linked to Rényi entropy

02 Results hold under exponential mixing

03 Focus on quenched analysis

Abstract

In this paper, we study the behaviour of the longest common substring for random subshifts of finite type (for dynamicists) or of the longest common substring for random sequences in random environments (for probabilists). We prove that, under some exponential mixing assumptions, this behaviour is linked to the Rényi entropy of the stationary measure. We emphasize that what we establish is a quenched result.

Equations

ACAATGAGAGGATGACCTTG

TGACTGTAACTGACACAAGC

$$M_{n}(x,y)=\max\{m:x_{i+k}=y_{j+k}\text{ for }k=1,\dots,m\text{ and for some }0\leq i,j\leq n-m\}.$$

$$\lim_{n\to\infty}\frac{M_{n}(x,y)}{\log n}=\frac{2}{-\log p}$$

$$H_{2}(\mu)=\lim_{k\to\infty}\frac{\log\sum\mu(C_{k})^{2}}{-k}$$

$$\lim_{n\to\infty}\frac{M_{n}(x,y)}{\log n}=\frac{2}{H_{2}(\mu)}.$$

$$m_{n}(x,y)=\min_{i,j=0,\dots,n-1}\left(d(T^{i}x,T^{j}y)\right).$$

$$\mathcal{E}_{\omega}=\{x=(x_{0},x_{1},\dots):x_{i}\in X_{\theta^{i}\omega}\text{ and }a_{x_{i}x_{i+1}}(\theta^{i}\omega)=1\text{ for all }i\in\mathbb{N}\}\subset X,$$

$$\mathcal{E}=\{(\omega,x):\omega\in\Omega,\ x\in\mathcal{E}_{\omega}\}\subset\Omega\times X.$$

$$(\sigma^{i})_{*}\mu_{\omega}=\mu_{\theta^{i}\omega}\quad\text{for all }i\in\mathbb{N}.$$

$$M_{n}(x,y)=\max\{m:x_{i+k}=y_{j+k}\text{ for }k=1,\dots,m\text{ and for some }0\leq i,j\leq n-m\}.$$

$$\underline{H}_{2}(\mu)=\underset{k\to\infty}{\underline{\lim}}\frac{\log\sum\mu(C_{k})^{2}}{-k}\quad\text{and}\quad\overline{H}_{2}(\mu)=\underset{k\to\infty}{\overline{\lim}}\frac{\log\sum\mu(C_{k})^{2}}{-k},$$

$$h_{0}=\underset{k\to+\infty}{\underline{\lim}}\frac{\log\int_{\Omega}\max_{C_{k}}\mu_{\omega}(C_{k})\,d\mathbb{P}}{-k}$$

$$\left|\mu(A\cap\sigma^{-g-n}A)-\mu(A)^{2}\right|\leq\alpha(g);$$

$$\left|\mu_{\omega}(A\cap\sigma^{-g-n}B)-\mu_{\omega}(A)\mu_{\theta^{n+g}\omega}(B)\right|\leq\alpha(g).$$

$$\left|\mu(A\cap\sigma^{-g-n}B)-\mu(A)\mu(B)\right|\leq\alpha(g)$$

$$\underset{n\to\infty}{\overline{\lim}}\frac{M_{n}(x,y)}{\log n}\leq\frac{2}{\underline{H}_{2}(\mu)}\quad\text{for }\nu\otimes\nu\text{-almost every }((\omega,x),(\tilde{\omega},y))\in\mathcal{E}\times\mathcal{E}.$$

$$\underset{n\to\infty}{\underline{\lim}}\frac{M_{n}(x,y)}{\log n}\geq\frac{2}{\overline{H}_{2}(\mu)}\quad\text{for }\nu\otimes\nu\text{-almost every }((\omega,x),(\tilde{\omega},y))\in\mathcal{E}\times\mathcal{E}.$$

$$\underset{n\to\infty}{\overline{\lim}}\frac{M_{n}(x,y)}{\log n}\leq\frac{2}{\underline{H}_{2}(\mu)}\quad\text{for }\mu_{\omega}\otimes\mu_{\omega}\text{-almost every }(x,y)\in\mathcal{E}_{\omega}\times\mathcal{E}_{\omega}.$$

$$\left|\int_{\Omega}\psi\cdot\phi\circ\theta^{n+g}\,d\mathbb{P}-\int_{\Omega}\psi\,d\mathbb{P}\int_{\Omega}\phi\,d\mathbb{P}\right|\leq\rho(g)\|\psi\|_{2}\|\phi\|_{2}$$

$$\underset{n\to\infty}{\underline{\lim}}\frac{M_{n}(x,y)}{\log n}\geq\frac{2}{\overline{H}_{2}(\mu)}\quad\text{for }\mu_{\omega}\otimes\mu_{\omega}\text{-almost every }(x,y)\in\mathcal{E}_{\omega}\times\mathcal{E}_{\omega}.$$

$$\lim_{n\to\infty}\frac{M_{n}(x,y)}{\log n}=\frac{2}{H_{2}(\mu)}\quad\text{for }\mu_{\omega}\otimes\mu_{\omega}\text{-almost every }(x,y)\in\mathcal{E}_{\omega}\times\mathcal{E}_{\omega}.$$

$$\left|\mu(A\cap\sigma^{-g-n}B)-\mu(A)\mu(B)\right|\leq\alpha(g)\mu(A);$$

$$\left|\mu_{\omega}(A\cap\sigma^{-g-n}B)-\mu_{\omega}(A)\mu_{\theta^{n+g}\omega}(B)\right|\leq\alpha(g)\mu(A),$$

$$\left|\int_{\Omega}\psi\cdot\phi\circ\theta^{n}\cdot\varphi\circ\theta^{n+m}\,d\mathbb{P}-\int_{\Omega}\psi\,d\mathbb{P}\int_{\Omega}\phi\,d\mathbb{P}\int_{\Omega}\varphi\,d\mathbb{P}\right|\leq\|\psi\|_{\mathcal{B}}\|\phi\|_{\mathcal{B}}\|\varphi\|_{\mathcal{B}}\,\rho(\min(n,m))$$

$$\|\psi_{1}\|_{\mathcal{B}}\leq\xi^{n}\quad\text{and}\quad\|\psi_{2}\|_{\mathcal{B}}\leq\xi^{n}.$$

$$\left|\int_{\Omega}\psi\cdot\phi\circ\theta^{n}\cdot\varphi\circ\theta^{n+m}\cdot\upsilon\circ\theta^{n+m+l}\,d\mathbb{P}-\int_{\Omega}\psi\,d\mathbb{P}\int_{\Omega}\phi\,d\mathbb{P}\int_{\Omega}\varphi\,d\mathbb{P}\int_{\Omega}\upsilon\,d\mathbb{P}\right|\leq\|\psi\|_{\mathcal{B}}\|\phi\|_{\mathcal{B}}\|\varphi\|_{\mathcal{B}}\|\upsilon\|_{\mathcal{B}}\,\rho(\min(n,m,l))$$

$$\mu_{\omega}([x_{0}\dots x_{n}])=p_{x_{0}}(\omega)\,p_{x_{1}}(\theta\omega)\dots p_{x_{n}}(\theta^{n}\omega).$$

$$\mu_{\omega}(A\cap\sigma^{-g-n}B)-\mu_{\omega}(A)\mu_{\theta^{n+g}\omega}(B)=0$$

$$\mu([x_{0}\dots x_{n}])$$

Full text

Longest common substring for random subshifts of finite type

Jérôme Rousseau

www.sd.mat.ufba.br/~jerome.rousseau

Universidade do Porto and Universidade Federal da Bahia

Departamento de Matemática,

Faculdade de Ciências da Universidade do Porto,

Rua do Campo Alegre, 687,

4169-007 Porto, Portugal

Departamento de Matemática,

Universidade Federal da Bahia,

Av. Ademar de Barros s/n,

40170-110 Salvador, Brazil

Abstract

In this paper, we study the behaviour of the longest common substring for random subshifts of finite type (for dynamicists) or of the longest common substring for random sequences in random environments (for probabilists). We prove that, under some exponential mixing assumptions, this behaviour is linked to the Rényi entropy of the stationary measure. We emphasize that what we establish is a quenched result.

Keywords: Longest common substring, Rényi entropy, random dynamical systems, random sequences in random environments, string matching.

MSC: 60F15, 60K37, 37A50, 37A25, 37Hxx, 94A17, 92D20.

This work was partially supported by CNPq, by FCT project PTDC/MAT-PUR/28177/2017, with national funds, and by CMUP (UIDB/00144/2020), which is funded by FCT with national (MCTES) and European structural funds through the programs FEDER, under the partnership agreement PT2020.

Introduction

To measure the similarity between sequences, one needs computational tools to compare them (and to optimize the algorithm) and probabilistic tools to assess the significance of the relationship. Thus sequence comparison (and in particular sequence alignment and sequence matching) has its roots in computer science and probability, with applications in areas as diverse as bioinformatics, geology, linguistics and social sciences. We refer the reader to [31, 39] for a broad introduction to sequence comparison (with particular attention to biology).

One particularly relevant object in DNA comparison is the longest common substring, i.e. the longest string of DNA which appears in two (or more) strands. For example, for the following two strands

ACAATGAGAGGATGACCTTG

TGACTGTAACTGACACAAGC

a longest common substring is ACAA (TGAC is another one); it has length 4, while each strand has length 20. To decide whether such behaviour is common or rare, one needs probabilistic results that quantify the statistical significance of the comparison.
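The example above can be checked mechanically. The following is a minimal sketch using the classical dynamic-programming recursion on common suffixes; the function name `longest_common_substrings` is hypothetical, and the two strands are written without spaces.

```python
def longest_common_substrings(x, y):
    """Return (length, set of substrings) for the longest common substrings of x and y."""
    best_len = 0
    best = set()
    # prev[j] holds the length of the longest common suffix of x[:i-1] and y[:j].
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        curr = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    # Strictly longer match: restart the collection.
                    best_len = curr[j]
                    best = {x[i - curr[j]:i]}
                elif curr[j] == best_len:
                    # Another match of the current maximal length.
                    best.add(x[i - curr[j]:i])
        prev = curr
    return best_len, best


strand_1 = "ACAATGAGAGGATGACCTTG"
strand_2 = "TGACTGTAACTGACACAAGC"
length, matches = longest_common_substrings(strand_1, strand_2)
print(length, sorted(matches))
```

The recursion runs in time proportional to the product of the two lengths, which is enough for short strands; suffix-based structures would be the usual choice for long ones.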

In this paper, we will concentrate on the behaviour of the length of the longest common substring when the length of the strings grows, more precisely, for two sequences $x$ and $y$ , the behaviour, when $n$ goes to infinity, of

$$M_{n}(x,y)=\max\{m:x_{i+k}=y_{j+k}\text{ for }k=1,\dots,m\text{ and for some }0\leq i,j\leq n-m\}.$$

For sequences drawn randomly from the same alphabet, this problem was studied by Arratia and Waterman in [4]. More precisely, if each term of the sequences is drawn independently within some alphabet $\mathcal{A}$ with respect to some probability $\mathcal{P}$ , then they proved that for $\mathcal{P}^{\mathbb{N}}\otimes\mathcal{P}^{\mathbb{N}}$ -almost every $(x,y)\in\mathcal{A}^{\mathbb{N}}\times\mathcal{A}^{\mathbb{N}}$

$$\lim_{n\to\infty}\frac{M_{n}(x,y)}{\log n}=\frac{2}{-\log p}$$

where $p=\sum_{a\in\mathcal{A}}\mathcal{P}(a)^{2}$ .

They also proved the same result for independent irreducible and aperiodic Markov chains on a finite alphabet, and in this case $p$ is the largest eigenvalue of the matrix $[(p_{ij})^{2}]$ (where $[p_{ij}]$ is the transition matrix).
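To make the constant $p$ concrete, here is a small sketch that computes it in both settings. The helper names are hypothetical, and the power iteration assumes the matrix $[(p_{ij})^{2}]$ is primitive, which holds for an irreducible aperiodic chain; it is an illustration, not the method used in [4].

```python
import math


def collision_probability(weights):
    """p = sum_a P(a)^2 for an i.i.d. source with letter probabilities `weights`."""
    return sum(w * w for w in weights)


def largest_eigenvalue_of_squared_matrix(transition, iterations=200):
    """Power iteration for the largest eigenvalue of the matrix [(p_ij)^2]."""
    size = len(transition)
    squared = [[transition[i][j] ** 2 for j in range(size)] for i in range(size)]
    vector = [1.0] * size
    eigenvalue = 0.0
    for _ in range(iterations):
        image = [sum(squared[i][j] * vector[j] for j in range(size)) for i in range(size)]
        eigenvalue = max(image)  # the vector is normalised to have maximum 1
        vector = [value / eigenvalue for value in image]
    return eigenvalue


def predicted_length(n, p):
    """Leading-order prediction 2 log n / (-log p) for M_n."""
    return 2 * math.log(n) / (-math.log(p))


uniform_dna = [0.25, 0.25, 0.25, 0.25]
p_iid = collision_probability(uniform_dna)
# An i.i.d. source seen as a Markov chain: every row of the transition matrix is P.
p_markov = largest_eigenvalue_of_squared_matrix([uniform_dna for _ in range(4)])
p_two_state = largest_eigenvalue_of_squared_matrix([[0.9, 0.1], [0.2, 0.8]])
print(p_iid, p_markov, p_two_state, predicted_length(20, p_iid))
```

For a uniform four-letter alphabet and $n=20$ the prediction $2\log 20/\log 4$ is about 4.3. The limit statement is asymptotic, so this is only an order-of-magnitude guide for strands that short.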

In fact, one can observe that in both cases, $-\log p$ corresponds to the Rényi entropy of $\mu$ defined (provided that it exists) by

$$H_{2}(\mu)=\lim_{k\to\infty}\frac{\log\sum\mu(C_{k})^{2}}{-k}$$

where the sums are taken over all k-cylinders. Even if the existence of the Rényi entropy is not known in general, it was computed in some particular cases: Bernoulli shift, finite state Markov chains, Gibbs measure of a Hölder-continuous potential [20] and infinite state Markov chains [11]. The existence was also proved for $\phi$ -mixing measures [27], for weakly $\psi$ -mixing processes [20] and for $\psi_{g}$ -regular processes [1, 2].
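For a Bernoulli measure the sum over $k$-cylinders factorizes, so the quotient defining $H_{2}(\mu)$ equals $-\log\sum_{a}\mathcal{P}(a)^{2}$ for every $k$. The sketch below checks this by brute-force enumeration of cylinders; the function name and the example weights are illustrative assumptions.

```python
import itertools
import math


def renyi_entropy_estimate(weights, k):
    """-(1/k) log sum_{C_k} mu(C_k)^2 for the Bernoulli measure with the given letter weights."""
    total = 0.0
    for word in itertools.product(range(len(weights)), repeat=k):
        # Measure of the k-cylinder determined by `word`.
        measure = math.prod(weights[a] for a in word)
        total += measure ** 2
    return -math.log(total) / k


weights = [0.5, 0.3, 0.2]
exact = -math.log(sum(w * w for w in weights))
estimates = [renyi_entropy_estimate(weights, k) for k in (1, 3, 6)]
print(exact, estimates)
```

The enumeration grows exponentially in $k$, so it only serves as a sanity check on small alphabets.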

Generalizations of the work [4] to sequences of different lengths, different distributions, more than two sequences, extreme value theory for sequence matching and distributional results can be found in e.g. [5, 8, 6, 7, 21, 15, 29, 28]. In a similar direction, one can also see [12, 14] (and references therein) where the authors investigate the growth rate of the maximal overlap in a string (i.e. the growth rate of the length of the longest repeated substring). We also refer to [37, 3, 2, 25] for relatively close problems.

Recently, in [9], the results of Arratia and Waterman were generalized to $\alpha$ -mixing systems with exponential decay (and $\psi$ -mixing with polynomial decay) and it was proved that if the Rényi entropy exists then for $\mu\otimes\mu$ -almost every $(x,y)$

$$\lim_{n\to\infty}\frac{M_{n}(x,y)}{\log n}=\frac{2}{H_{2}(\mu)}.$$

Furthermore, it was also shown in this paper that a generalization of the longest common substring problem for dynamical systems is to study the behaviour of the shortest distance between two orbits, which is, for a dynamical system $(X,T,\mu)$ , the behaviour, when $n$ goes to infinity, of

$$m_{n}(x,y)=\min_{i,j=0,\dots,n-1}\left(d(T^{i}x,T^{j}y)\right).$$

Moreover, a relation between $m_{n}$ and the correlation dimension of the invariant measure was proved.

It is natural to try and obtain the same type of results for random dynamical systems since they could model more precisely physical phenomena. For random sequences, this could correspond for example to a modification (e.g. a small perturbation) on the probability with which the letters of the alphabet are drawn (i.e. random sequences in random environments). For dynamical systems, this could correspond to adding some random noise or small perturbations while iterating the same transformation, or iterating different transformations drawn randomly within a family of transformations (see e.g. [24] for an introduction to random dynamical systems).

In [13], the behaviour of the longest common substring of encoded sequences (and of the shortest distance between observed orbits) was studied and a relation with the Rényi entropy of the pushforward measure was proved. In particular, it allows the authors to obtain annealed results on the shortest distance between orbits of random dynamical systems.

Obtaining quenched results is much more delicate, in particular because the random maps generally do not have a common invariant measure. The first family of random dynamical systems for which one can hope to obtain results is that of random subshifts of finite type. Indeed, good mixing properties have been proved (see e.g. [22, 10, 23, 38]), which make it possible to obtain other statistical properties (e.g. [34, 35, 19] for the distribution of hitting times, [18] for extreme value laws). Following this idea and the setting of these papers, we study here the behaviour of the longest common substring for random subshifts of finite type (in probabilistic language, this corresponds to the longest common substring for random sequences in random environments) and prove a link with the Rényi entropy of the stationary measure.

The paper is organized as follows. In Section 1, we will define random subshifts of finite type, explain our assumptions and give an upper bound (Theorem 2) and a lower bound (Theorem 3 and Theorem 4) for the growth rate of the longest common substring for random subshifts. In Section 2, we will apply our results to random Bernoulli shifts and random Gibbs measures. The proof of the theorems will be given in Section 3.

1 Statement of the main results

We first give the definition of a random subshift of finite type. Let $(\Omega,\theta,\mathbb{P})$ be an invertible ergodic measure preserving system, set $X=\{1,\dots,N\}^{\mathbb{N}}$ for some $N\in\mathbb{N}$ and let $\sigma:X\to X$ denote the shift. Let $b:\Omega\to\{1,\dots,N\}$ be a random variable. Let $A=\left\{A(\omega)=(a_{ij}(\omega)):\omega\in\Omega\right\}$ be a random transition matrix, i.e. for any $\omega\in\Omega$ , $A(\omega)$ is a $b(\omega)\times b(\theta\omega)$ -matrix with entries in $\{0,1\}$ , at least one non-zero entry in each row and each column and such that $\omega\mapsto a_{ij}(\omega)$ is measurable for any $i\in\mathbb{N}$ and $j\in\mathbb{N}$ . For any $\omega\in\Omega$ define the subset of the integers $X_{\omega}=\{1,\ldots,b(\omega)\}$ and

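$$\mathcal{E}_{\omega}=\{x=(x_{0},x_{1},\dots):x_{i}\in X_{\theta^{i}\omega}\text{ and }a_{x_{i}x_{i+1}}(\theta^{i}\omega)=1\text{ for all }i\in\mathbb{N}\}\subset X,$$

$$\mathcal{E}=\{(\omega,x):\omega\in\Omega,\ x\in\mathcal{E}_{\omega}\}\subset\Omega\times X.$$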
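As an illustration of the fibre $\mathcal{E}_{\omega}$, the sketch below tests whether a finite word is admissible along one realisation of the environment. The function name, the list encoding of $b(\theta^{i}\omega)$ and $A(\theta^{i}\omega)$, and the example matrices are all assumptions made for the example.

```python
def is_admissible(word, sizes, matrices):
    """Check a finite word against the constraints defining the fibre over omega.

    sizes[i]    stands for b(theta^i omega), so that X_{theta^i omega} = {1, ..., sizes[i]};
    matrices[i] stands for A(theta^i omega), a 0/1 matrix with sizes[i] rows
                and sizes[i + 1] columns.
    """
    # Each letter must belong to the alphabet available at its own time step.
    for i, letter in enumerate(word):
        if not 1 <= letter <= sizes[i]:
            return False
    # Each transition must be allowed by the matrix of the current time step.
    for i in range(len(word) - 1):
        if matrices[i][word[i] - 1][word[i + 1] - 1] != 1:
            return False
    return True


sizes = [2, 3, 2]
matrices = [
    [[1, 0, 1], [0, 1, 1]],    # A(omega): 2 x 3
    [[1, 0], [0, 1], [1, 1]],  # A(theta omega): 3 x 2
]
print(is_admissible([1, 3, 2], sizes, matrices))
print(is_admissible([1, 2, 1], sizes, matrices))
```

Letters are 1-based here to match $X_{\omega}=\{1,\ldots,b(\omega)\}$, hence the index shift when reading the matrices.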

We consider the random dynamical system coded by the skew-product $S:\mathcal{E}\to\mathcal{E}$ given by $S(\omega,x)=(\theta\omega,\sigma x)$ . Let $\nu$ be an $S$ -invariant probability measure with marginal $\mathbb{P}$ on $\Omega$ and let $(\mu_{\omega})_{\omega}$ denote its decomposition on $\mathcal{E}_{\omega}$ , that is, $d\nu(\omega,x)=d\mu_{\omega}(x)d\mathbb{P}(\omega)$ . The measures $\mu_{\omega}$ are called the sample measures. Note $\mu_{\omega}(A)=0$ if $A\cap X_{\omega}=\emptyset$ . We denote by $\mu=\int\mu_{\omega}\,d\mathbb{P}$ the marginal of $\nu$ on $X$ .

We emphasize that the sample measures are not invariant. However, since $\theta$ is invertible, by $S$-invariance of $\nu$ and almost everywhere uniqueness of the decomposition $d\nu=d\mu_{\omega}\,d\mathbb{P}$, we get for $\mathbb{P}$-almost every $\omega\in\Omega$,

$$(\sigma^{i})_{*}\mu_{\omega}=\mu_{\theta^{i}\omega}\quad\text{for all }i\in\mathbb{N}.$$

For $y\in X$ we denote by $C_{n}(y)=\{z\in X:y_{i}=z_{i}\text{ for all }0\leq i\leq n-1\}$ the $n$ -cylinder that contains $y$ . Set $\mathcal{F}_{0}^{n}(X)$ as the sigma-algebra in $X$ generated by all the $n$ -cylinders.

As explained in the introduction, for two sequences $x,y\in X$, we are interested in the asymptotic behaviour of the longest common substring, that is the behaviour of

$$M_{n}(x,y)=\max\{m:x_{i+k}=y_{j+k}\text{ for }k=1,\dots,m\text{ and for some }0\leq i,j\leq n-m\}.$$

We will show it is linked to the Rényi entropy of the stationary measure $\mu$ . Thus, we define the lower and upper Rényi entropies of the measure $\mu$ :

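$$\underline{H}_{2}(\mu)=\underset{k\to\infty}{\underline{\lim}}\frac{\log\sum\mu(C_{k})^{2}}{-k}\quad\text{and}\quad\overline{H}_{2}(\mu)=\underset{k\to\infty}{\overline{\lim}}\frac{\log\sum\mu(C_{k})^{2}}{-k},$$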

where the sums are taken over all k-cylinders. When the limit exists we denote by ${H}_{2}(\mu)$ the common value.

To obtain our results, we will need information on the decay of the measure of cylinders, thus we define

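$$h_{0}=\underset{k\to+\infty}{\underline{\lim}}\frac{\log\int_{\Omega}\max_{C_{k}}\mu_{\omega}(C_{k})\,d\mathbb{P}}{-k}$$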

where the max is taken over all k-cylinders.

We will assume the following: there is a constant $a\in[0,1)$ and a function $\alpha(g)$ satisfying $\alpha(g)=\mathcal{O}(a^{g})$ such that for all $n,m$ , $A\in\mathcal{F}_{0}^{n}(X)$ and $B\in\mathcal{F}_{0}^{m}(X)$ :

(I)

the marginal measure $\mu$ satisfies

$$\left|\mu(A\cap\sigma^{-g-n}A)-\mu(A)^{2}\right|\leq\alpha(g);$$

(II)

(fibered exponential $\alpha$ -mixing) for $\mathbb{P}$ -almost every $\omega\in\Omega$

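$$\left|\mu_{\omega}(A\cap\sigma^{-g-n}B)-\mu_{\omega}(A)\mu_{\theta^{n+g}\omega}(B)\right|\leq\alpha(g).$$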

One can observe that assumption (I) is weaker than $\alpha$ -mixing since in the intersection we only deal with the same cylinder $A$ . We recall that the measure $\mu$ is $\alpha$ -mixing if:

(I-a)

(exponential $\alpha$ -mixing) the marginal measure $\mu$ satisfies

$$\left|\mu(A\cap\sigma^{-g-n}B)-\mu(A)\mu(B)\right|\leq\alpha(g)$$

for all $m,n$ , $A\in\mathcal{F}_{0}^{n}(X)$ and $B\in\mathcal{F}_{0}^{m}(X)$ .

Before stating our results, we will consider the annealed case:

Theorem 1 (Theorem 4.4 [13]).

If $0<\underline{H}_{2}(\mu)$ , then

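$$\underset{n\to\infty}{\overline{\lim}}\frac{M_{n}(x,y)}{\log n}\leq\frac{2}{\underline{H}_{2}(\mu)}\quad\text{for }\nu\otimes\nu\text{-almost every }((\omega,x),(\tilde{\omega},y))\in\mathcal{E}\times\mathcal{E}.$$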

Moreover, if hypothesis (I-a) holds, then

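$$\underset{n\to\infty}{\underline{\lim}}\frac{M_{n}(x,y)}{\log n}\geq\frac{2}{\overline{H}_{2}(\mu)}\quad\text{for }\nu\otimes\nu\text{-almost every }((\omega,x),(\tilde{\omega},y))\in\mathcal{E}\times\mathcal{E}.$$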

First of all, we observe that the statement of this theorem is slightly different from that of Theorem 4.4 in [13], since the latter considers more general dynamical systems and not only random subshifts of finite type. Nevertheless, one can easily adapt their results and proof to obtain the theorem as stated here.

One could wonder why the Rényi entropy of $\mu$ appears in these results (and not the Rényi entropy of $\nu$, for example). In fact, when studying $M_{n}$, we are not interested in the behaviour of the whole orbit $S^{n}(\omega,x)$ but only in its projection on $X$ (called an observation of the dynamical system). More precisely, if $\pi:\mathcal{E}\rightarrow X$ denotes the canonical projection (i.e. $\pi(\omega,x)=x$), we study the behaviour of the image (or observation) of the orbits, that is $\pi(S^{n}(\omega,x))$. The idea of observing dynamical systems was developed in [33, 32] to obtain annealed results for return times in random dynamical systems, and in [13] for the shortest distance between random orbits. Moreover, it was proved that, when observing dynamical systems, these quantities are linked with the dimension (or in our case the Rényi entropy) of the pushforward measure $\pi_{*}\nu$ (where $\pi_{*}\nu(.)=\nu(\pi^{-1}(.))$). Furthermore, in our random setting the pushforward measure $\pi_{*}\nu$ and the measure $\mu$ are equal (e.g. [32, Proof of Theorem 8]) and thus ${H}_{2}(\pi_{*}\nu)={H}_{2}(\mu)$.

Unfortunately, these techniques only give annealed results; thus, in this paper, we will use different tools to obtain quenched results.

Remark 1.

We note that since $S$ is a dynamical system, one could apply (under the right assumptions) the results of [9] to study the shortest distance between orbits $m_{n}((\omega,x),(\tilde{\omega},y))=\min_{i,j=0,\dots,n-1}\left(d(S^{i}(\omega,x),S^{j}(\tilde{\omega},y))\right)$ and link it to the correlation dimension of $\nu$. Nevertheless, it will not give us precise information on $M_{n}(x,y)$, since $m_{n}((\omega,x),(\tilde{\omega},y))$ takes into account the distance between elements of the orbits of $(\omega,x)$ and $(\tilde{\omega},y)$ while $M_{n}$ only considers elements of the orbits of $x$ and $y$.

We present now the first main result of this section which gives an upper bound for the growth rate of the longest common substring.

Theorem 2.

If $0<\underline{H}_{2}(\mu)\leq 2h_{0}$ and if hypotheses (I) and (II) hold, then for $\mathbb{P}$-almost every $\omega\in\Omega$,

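$$\underset{n\to\infty}{\overline{\lim}}\frac{M_{n}(x,y)}{\log n}\leq\frac{2}{\underline{H}_{2}(\mu)}\quad\text{for }\mu_{\omega}\otimes\mu_{\omega}\text{-almost every }(x,y)\in\mathcal{E}_{\omega}\times\mathcal{E}_{\omega}.$$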

One can notice that in the deterministic case [9] and in the annealed case, no mixing assumptions are needed to obtain the upper bound. As one can see in the proof of this theorem, the main problem and difference with the deterministic case is that the sample measures are not invariant which is the main reason to use mixing to obtain the upper bound (and the lower).

Moreover, assuming $\underline{H}_{2}(\mu)\leq 2h_{0}$ is not too restrictive. Indeed, in the deterministic case this hypothesis is always satisfied (see e.g. [20], in the proof of Theorem 1 (IV)). In the random setting, this assumption rules out, for example, sample measures whose behaviour is extreme relative to the others.

To obtain a lower bound, we will need stronger assumptions: we will need $\alpha$ -mixing for the measure $\mu$ and we will require some mixing properties for the base transformation $(\Omega,\theta,\mathbb{P})$ .

First of all, we will treat the case when $(\Omega,\theta,\mathbb{P})$ is a $\rho$ -mixing two-sided shift, i.e. $\Omega=\mathcal{A}^{\mathbb{Z}}$ for some alphabet $\mathcal{A}$ , $\theta$ is the shift and:

(III)

(exponential $\rho$ -mixing) For all $n$ and for all $\psi\in L^{2}(\mathcal{F}_{-\infty}^{n}(\Omega))$ and $\phi\in L^{2}(\mathcal{F}_{0}^{\infty}(\Omega))$

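$$\left|\int_{\Omega}\psi\cdot\phi\circ\theta^{n+g}\,d\mathbb{P}-\int_{\Omega}\psi\,d\mathbb{P}\int_{\Omega}\phi\,d\mathbb{P}\right|\leq\rho(g)\|\psi\|_{2}\|\phi\|_{2}$$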

with $\rho(n)=\mathcal{O}(a^{n})$ .

Moreover, we will need that the sample measure $\mu_{\omega}$ of a cylinder of size $n$ does not depend on all the terms of $\omega$ :

(IV)

there exists a function $h$ with $h(n)=\mathcal{O}(n)$ such that for $\mathbb{P}$ -almost every $\omega$ and every cylinder $C\in\mathcal{F}_{0}^{n}(X)$ , the function $\omega\mapsto\mu_{\omega}(C)$ belongs to $L^{2}(\mathcal{F}_{-h(n)}^{h(n)}(\Omega))$ .

One can observe that it is quite simple to check if assumption (IV) is satisfied, however this assumption is restrictive and only enables us to work with some special family of sample measures. Nevertheless, if the system $(\Omega,\theta,\mathbb{P})$ satisfies some stronger mixing assumption we will be able to work with more general families of sample measures. Thus, after the statement of the next theorem we will give an alternative couple of assumptions which also allows us to obtain a lower bound for the growth rate of the longest common substring.

Theorem 3.

If $0<\underline{H}_{2}(\mu)\leq\overline{H}_{2}(\mu)<2h_{0}$ and if hypotheses (I-a), (II), (III) and (IV) hold, then, for $\mathbb{P}$-almost every $\omega\in\Omega$,

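$$\underset{n\to\infty}{\underline{\lim}}\frac{M_{n}(x,y)}{\log n}\geq\frac{2}{\overline{H}_{2}(\mu)}\quad\text{for }\mu_{\omega}\otimes\mu_{\omega}\text{-almost every }(x,y)\in\mathcal{E}_{\omega}\times\mathcal{E}_{\omega}.$$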

Moreover, if the Rényi entropy exists, we get for $\mathbb{P}$ -almost every $\omega\in\Omega$ ,

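$$\lim_{n\to\infty}\frac{M_{n}(x,y)}{\log n}=\frac{2}{H_{2}(\mu)}\quad\text{for }\mu_{\omega}\otimes\mu_{\omega}\text{-almost every }(x,y)\in\mathcal{E}_{\omega}\times\mathcal{E}_{\omega}.$$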

In Section 2.1, we will apply this result to random Bernoulli shifts.

Remark 2 (Infinite alphabets).

One can observe in the proof of Theorem 3, that stronger mixing assumptions for the stationary measure and the sample measures allow us to work with infinite alphabets. More precisely, if in Theorem 3, one replaces assumptions (I-a) and (II) by

(I’) (exponential $\phi$ -mixing) the marginal measure $\mu$ satisfies

$$\left|\mu(A\cap\sigma^{-g-n}B)-\mu(A)\mu(B)\right|\leq\alpha(g)\mu(A);$$

and

(II’) (fibered exponential $\phi$ -mixing) for $\mathbb{P}$ -almost every $\omega\in\Omega$

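$$\left|\mu_{\omega}(A\cap\sigma^{-g-n}B)-\mu_{\omega}(A)\mu_{\theta^{n+g}\omega}(B)\right|\leq\alpha(g)\mu(A),$$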

then the same conclusions hold.

To deal with more general random subshifts (and in particular random Gibbs measures in Section 2.2) we will need a stronger mixing assumption on the base $(\Omega,\theta,\mathbb{P})$ (satisfied for example for Anosov diffeomorphisms [26]):

(III’)

(exponential $3$ -mixing) There exists a Banach space $\mathcal{B}$ such that for all $\psi,\ \phi,\ \varphi\in\mathcal{B}$ , for all $n\in\mathbb{N}^{*}$ and $m\in\mathbb{N}^{*}$ , we have

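$$\left|\int_{\Omega}\psi\cdot\phi\circ\theta^{n}\cdot\varphi\circ\theta^{n+m}\,d\mathbb{P}-\int_{\Omega}\psi\,d\mathbb{P}\int_{\Omega}\phi\,d\mathbb{P}\int_{\Omega}\varphi\,d\mathbb{P}\right|\leq\|\psi\|_{\mathcal{B}}\|\phi\|_{\mathcal{B}}\|\varphi\|_{\mathcal{B}}\,\rho(\min(n,m))$$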

with $\rho(n)=\mathcal{O}(a^{n})$ and $\|.\|_{\mathcal{B}}$ is the norm in the Banach space $\mathcal{B}$ .

We are now able to replace assumption (IV) by a less restrictive assumption:

(IV’)

There exists $\xi\geq 0$ such that for every $n\in\mathbb{N}$ and every cylinder $C\in\mathcal{F}_{0}^{n}(X)$ , the functions $\psi_{1}:\omega\mapsto\mu_{\omega}(C)$ and $\psi_{2}:\omega\mapsto\max_{C_{n}}\mu_{\omega}(C_{n})$ (where the max is taken over all n-cylinders) belong to the Banach space $\mathcal{B}$ and

$$\|\psi_{1}\|_{\mathcal{B}}\leq\xi^{n}\quad\text{and}\quad\|\psi_{2}\|_{\mathcal{B}}\leq\xi^{n}.$$

Moreover, if the base $(\Omega,\theta,\mathbb{P})$ satisfies exponential $4$-mixing, we can weaken our mixing assumption on the marginal measure $\mu$ and use assumption (I):

(III”)

(exponential $4$ -mixing) There exists a Banach space $\mathcal{B}$ such that for all $\psi,\ \phi,\ \varphi,\ \upsilon\in\mathcal{B}$ , for all $n\in\mathbb{N}^{*}$ , $m\in\mathbb{N}^{*}$ and $l\in\mathbb{N}^{*}$ , we have

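$$\left|\int_{\Omega}\psi\cdot\phi\circ\theta^{n}\cdot\varphi\circ\theta^{n+m}\cdot\upsilon\circ\theta^{n+m+l}\,d\mathbb{P}-\int_{\Omega}\psi\,d\mathbb{P}\int_{\Omega}\phi\,d\mathbb{P}\int_{\Omega}\varphi\,d\mathbb{P}\int_{\Omega}\upsilon\,d\mathbb{P}\right|\leq\|\psi\|_{\mathcal{B}}\|\phi\|_{\mathcal{B}}\|\varphi\|_{\mathcal{B}}\|\upsilon\|_{\mathcal{B}}\,\rho(\min(n,m,l))$$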

with $\rho(n)=\mathcal{O}(a^{n})$ and $\|.\|_{\mathcal{B}}$ is the norm in the Banach space $\mathcal{B}$ .

In Section 2.2, we will check these assumptions for random Gibbs measures and will choose the Banach space $\mathcal{B}$ to be the space of Hölder continuous functions.

With these assumptions, we obtain the same results as in Theorem 3:

Theorem 4.

If $0<\underline{H}_{2}(\mu)\leq\overline{H}_{2}(\mu)<2h_{0}$ and if

$\bullet$ hypotheses (I-a), (II), (III’) and (IV’) are satisfied,

or

$\bullet$ hypotheses (I), (II), (III”) and (IV’) are satisfied,

then the conclusions of Theorem 3 hold.

We will now apply our results to random Bernoulli shifts and random Gibbs measures (these examples follow [34, 35], where assumptions (I-a) and (II) were proved in order to obtain a quenched exponential distribution of hitting times).

2 Examples

2.1 Random Bernoulli shifts

Let $s\geq 1$ and $(\Omega,\theta)$ be a subshift of finite type on the symbolic space $\{0,1,\ldots,s\}^{\mathbb{Z}}$ and let $\mathbb{P}$ be a Gibbs measure from a Hölder potential.

Let $b\geq 1$ and make the shift $\{0,1,\ldots,b\}^{\mathbb{N}}$ a random subshift by putting on it the random Bernoulli measures constructed as follows. Let $W=(w_{ij})$ be an $(s+1)\times(b+1)$ stochastic matrix with entries in $(0,1)$, with rows indexed by $\{0,\ldots,s\}$ and columns by $\{0,\ldots,b\}$. Set $p_{j}(\omega)=w_{\omega_{0},j}$. The random Bernoulli measure $\mu_{\omega}$ is defined by

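$$\mu_{\omega}([x_{0}\dots x_{n}])=p_{x_{0}}(\omega)\,p_{x_{1}}(\theta\omega)\dots p_{x_{n}}(\theta^{n}\omega).$$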

First of all, hypothesis (IV) is satisfied since $\mu_{\omega}([x_{0}\dots x_{n}])$ only depends on $\omega_{0},\dots,\omega_{n}$ .
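The dependence on $\omega_{0},\dots,\omega_{n}$ only is visible in a direct computation. The sketch below evaluates the sample measure of a cylinder as the product of the weights $p_{x_{i}}(\theta^{i}\omega)=w_{\omega_{i},x_{i}}$; the matrix `W`, the environment `omega` and the function name are illustrative assumptions, with one row of `W` per base symbol and one column per letter.

```python
import itertools
import math


def sample_measure_of_cylinder(W, omega, word):
    """mu_omega([x_0 ... x_n]) as the product of W[omega_i][x_i] over i = 0, ..., n.

    Only the coordinates omega_0, ..., omega_n of the environment are read.
    """
    return math.prod(W[omega[i]][letter] for i, letter in enumerate(word))


W = [[0.5, 0.5], [0.2, 0.8]]  # hypothetical weights for s = b = 1
omega = [0, 1, 1, 0]          # non-negative coordinates of one environment
value = sample_measure_of_cylinder(W, omega, [1, 0, 1])
total_mass = sum(
    sample_measure_of_cylinder(W, omega, word)
    for word in itertools.product(range(2), repeat=3)
)
print(value, total_mass)
```

Because each row of `W` sums to one, the cylinders of a fixed length carry total mass one for every environment.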

Since $\mu_{\omega}$ are Bernoulli measures, one can observe that for all $m,n$ , $A\in\mathcal{F}_{0}^{n}$ and $B\in\mathcal{F}_{0}^{m}$ :

$$\mu_{\omega}(A\cap\sigma^{-g-n}B)-\mu_{\omega}(A)\mu_{\theta^{n+g}\omega}(B)=0$$

for every $g\geq 1$ and every $\omega\in\Omega$ . Thus, property (I-a) is satisfied.
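The vanishing of this difference can be verified numerically for cylinders. In the sketch below $A$ is the cylinder of a word `a` of length $n$, $B$ the cylinder of a word `b`, and $\mu_{\omega}(A\cap\sigma^{-g-n}B)$ is obtained by summing over all filler words of length $g$; the weights, environment and helper name are assumptions chosen for the example.

```python
import itertools
import math


def cylinder_measure(W, omega, word, start=0):
    """Measure of the cylinder `word` under mu_{theta^start omega}."""
    return math.prod(W[omega[start + i]][letter] for i, letter in enumerate(word))


W = [[0.5, 0.5], [0.2, 0.8]]      # hypothetical weights
omega = [0, 1, 1, 0, 1, 0, 0, 1]  # one realisation of the environment
a, b, gap = (1, 0), (0, 1, 1), 2
n = len(a)

# mu_omega(A ∩ sigma^{-g-n} B): sum over every word that can fill the gap.
joint = sum(
    cylinder_measure(W, omega, a + filler + b)
    for filler in itertools.product(range(2), repeat=gap)
)
# mu_omega(A) * mu_{theta^{n+g} omega}(B)
product = cylinder_measure(W, omega, a) * cylinder_measure(W, omega, b, start=n + gap)
print(joint, product)
```

The two numbers agree up to rounding because the filler letters contribute a factor equal to one.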

Moreover, it was proved in [34] that assumption (II) is satisfied. Since the Gibbs measure $\mathbb{P}$ is exponentially $\psi$ -mixing, it is exponentially $\rho$ -mixing and (III) is satisfied. Thus, if $0<\underline{H}_{2}(\mu)\leq 2h_{0}$ one can apply Theorem 2 and if besides that $\overline{H}_{2}(\mu)<2h_{0}$ then one can apply Theorem 3.

For example, when the base is i.i.d., we can compute the Rényi entropy. Indeed

[TABLE]

Thus,

[TABLE]

and

[TABLE]

A similar computation gives us

[TABLE]

So, if $H_{2}(\mu)<2h_{0}$ , we have for $\mathbb{P}$ -almost every $\omega\in\Omega$ ,

[TABLE]

for $\mu_{\omega}\otimes\mu_{\omega}$ -almost every $(x,y)\in X\times X$ .

In this case, whether the condition $H_{2}(\mu)<2h_{0}$ holds or not can be easily checked. For example, this condition will be satisfied if the letter with the maximum weight is always the same. Indeed, assuming that there exists $b_{0}\in\{0,\dots,b\}$ such that $\max_{x_{0}}p_{x_{0}}(\omega)=p_{b_{0}}(\omega)$ for every $\omega\in\Omega$, we observe that

[TABLE]

and thus

[TABLE]
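The condition $H_{2}(\mu)<2h_{0}$ can also be tested numerically. The sketch below relies on an assumption that is not taken from the displayed formulas: for an i.i.d. base with symbol distribution `q`, the coordinates $\omega_{i}$ are independent, so the integrals factorize and give $H_{2}(\mu)=-\log\sum_{j}\big(\mathbb{E}[p_{j}]\big)^{2}$ and $h_{0}=-\log\mathbb{E}[\max_{j}p_{j}]$. The names `W`, `q` and `iid_base_entropies` are hypothetical.

```python
import math


def iid_base_entropies(W, q):
    """Return (H_2(mu), h_0) for an i.i.d. base, under the factorization assumption above.

    W[a][j] is the weight of letter j when the base symbol is a,
    q[a] is the probability of base symbol a.
    """
    letters = range(len(W[0]))
    # Averaged letter weights E[p_j]: mu is then Bernoulli with these weights.
    mean_weights = [sum(q[a] * W[a][j] for a in range(len(W))) for j in letters]
    h2 = -math.log(sum(m * m for m in mean_weights))
    # Expected maximal letter weight E[max_j p_j].
    h0 = -math.log(sum(q[a] * max(W[a]) for a in range(len(W))))
    return h2, h0


W = [[0.5, 0.3, 0.2], [0.6, 0.3, 0.1]]  # letter 0 is always the heaviest
q = [0.5, 0.5]
h2, h0 = iid_base_entropies(W, q)
print(h2, 2 * h0, h2 < 2 * h0)
```

In this example the heaviest letter is the same for both base symbols, which is the situation described above.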

Also, the condition $H_{2}(\mu)<2h_{0}$ will be satisfied if all the letters have relatively close probabilities, i.e. if there exists a constant $P$ such that $P<p_{j}(\omega)<P\sqrt{b+1}$ for every $j\in\{0,\dots,b\}$ and every $\omega\in\Omega$. Indeed, in this case, we have

[TABLE]

and thus $H_{2}(\mu)<2h_{0}$ . This could be applied to small perturbations of a uniform Bernoulli shift, i.e., $p_{j}(\omega)=\frac{1}{b+1}+\delta_{j}(\omega)$ with $\frac{1-\sqrt{b+1}}{(b+1)(1+\sqrt{b+1})}<\delta_{j}(\omega)<\frac{\sqrt{b+1}-1}{(b+1)(1+\sqrt{b+1})}$ for every $j\in\{0,\dots,b\}$ and every $\omega\in\Omega$ (one can easily check that in this case $P<p_{j}(\omega)<P\sqrt{b+1}$ with $P=\frac{2}{(b+1)(1+\sqrt{b+1})}$ ).
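The claim that the perturbation window matches $P<p_{j}(\omega)<P\sqrt{b+1}$ can be checked by evaluating the endpoints: adding the lower and upper bounds on $\delta_{j}(\omega)$ to $\frac{1}{b+1}$ should give $P$ and $P\sqrt{b+1}$ respectively. The following sketch does this for a few values of $b$; the function name is hypothetical.

```python
import math


def perturbation_window(b):
    """Return (P, delta_min, delta_max) for an alphabet with b + 1 letters."""
    root = math.sqrt(b + 1)
    P = 2 / ((b + 1) * (1 + root))
    delta_min = (1 - root) / ((b + 1) * (1 + root))
    delta_max = (root - 1) / ((b + 1) * (1 + root))
    return P, delta_min, delta_max


for b in (1, 3, 8):
    P, delta_min, delta_max = perturbation_window(b)
    uniform = 1 / (b + 1)
    # Endpoints of the admissible range for p_j(omega) = uniform + delta_j(omega).
    print(b, uniform + delta_min, P, uniform + delta_max, P * math.sqrt(b + 1))
```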

2.2 Random Gibbs measures

In this section we will give details on a family of shifts which satisfy our assumptions.

We will use the approach detailed in [38] which is concerned with shifts on $\mathbb{N}$ , for example the full shift. We note that this extends a little beyond the full shift, to the so-called BIP setting.

We assume that $(\Omega,\mathbb{P},\theta)$ is an invertible measure preserving system and let $X=N^{\mathbb{N}}$ and let $\sigma:X\to X$ denote the shift. For $r\in(0,1)$ , let $d_{r}$ be the usual symbolic metric on $X$ , i.e., $d_{r}(x,y)=r^{k}$ where $x_{i}=y_{i}$ for $i=0,\ldots,k-1$ , but $x_{k}\neq y_{k}$ .
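For concreteness, the symbolic metric can be sketched on finite prefixes as follows. Treating finite lists as stand-ins for infinite sequences, and returning 0 when the prefixes agree, are assumptions of this illustration; the function name is hypothetical.

```python
def symbolic_distance(x, y, r=0.5):
    """Return r**k, where k is the first index at which x and y differ.

    Finite prefixes stand in for infinite sequences here; if the compared
    prefixes agree everywhere, 0.0 is returned.
    """
    for k, (a, b) in enumerate(zip(x, y)):
        if a != b:
            return r ** k
    return 0.0


print(symbolic_distance([1, 2, 3], [1, 2, 4]))
print(symbolic_distance([2, 2, 3], [1, 2, 3]))
```

Two sequences are therefore within distance $r^{k}$ of each other exactly when they lie in the same $k$-cylinder.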

Assume that $\phi:X\times\Omega\to\mathbb{R}$ is a function which is almost surely Hölder continuous, which is to say, for

[TABLE]

there is some $r\in(0,1)$ and $\kappa(\omega)\geq 0$ such that $\int\log\kappa\,d\mathbb{P}<\infty$ where $V_{n}^{\omega}(\phi)\leq\kappa(\omega)r^{n}$.

Define $S_{n}\phi_{\omega}(x):=\sum_{k=0}^{n-1}\phi_{\theta^{k}\omega}\circ\sigma^{k}(x)$ . If $x,y$ are in the same $m$ -cylinder for $m\geq n$ , then $|S_{n}\phi_{\omega}(x)-S_{n}\phi_{\omega}(y)|\leq r^{m-n}\sum_{k=0}^{n-1}r^{k}\kappa(\theta^{n-k}\omega)$ . As in the proof of [17, Lemma 7.2], the assumption on the integrability of $\log\kappa$ implies that the above limit is finite a.s., say $\sum_{k=0}^{n-1}r^{k}\kappa(\theta^{n-k}\omega)\leq c_{\omega}$ . However, it is also pointed out in [38] that if $\kappa$ is integrable, then we have an a.s. uniform upper bound, say $C_{\phi}$ on $\sum_{k=0}^{n-1}r^{k}\kappa(\theta^{n-k}\omega)$ . Given a Hölder function $\psi$ , then we define

[TABLE]

Now we define the random Ruelle operator by

[TABLE]

where $\psi:X^{\prime}\to[0,\infty]$ where $X^{\prime}\subset X$ is such that $\mathcal{L}_{\omega}$ is well-defined. As in [16, 38], it can be shown that there exists some constant $\lambda_{\omega}$ and some measurable function $\rho_{\omega}$ which is uniformly bounded from below, such that $\mathcal{L}_{\omega}\rho_{\omega}=\lambda_{\omega}\rho_{\theta\omega}$ a.s. and such that $\log\rho$ satisfies the same smoothness properties as $\phi$ , i.e. we have the same $\kappa$ and $r$ in the variation. This allows us to replace $\phi$ with

[TABLE]

Letting $\mathcal{L}_{\omega}$ denote the corresponding transfer operator, one consequence of this is that $\mathcal{L}_{\omega}1=1$ . Note also that random equilibrium states for $\phi$ and $\varphi$ coincide.

Now we have the property that

[TABLE]

for appropriate observables $\psi,\gamma$ .

We will make the following almost sure assumptions on our system (which are satisfied for subshifts of finite type with Hölder potentials):

1. $\int\kappa\,d\mathbb{P}<\infty$, so $\sum_{k=0}^{\infty}r^{k}\kappa(\theta^{n-k}\omega)$ is a.s. uniformly bounded, independently of $\omega$.

2. There exists a measure $\mu_{\omega}$ such that $\mathcal{L}_{\omega}^{*}\mu_{\omega}=\mu_{\theta^{-1}\omega}$, i.e., (2) holds for $L^{1}$ observables.

3. Big images: there exists some $C_{BIP}>0$ such that for any $n$-cylinder $U$ and $\omega\in\Omega$, $\inf{\mu_{\theta^{n}\omega}(\sigma^{n}U)}>C_{BIP}$.

4. There exist $C>0$ and $g(n)\to 0$ as $n\to\infty$ such that

[TABLE]

Under these conditions, it was proved in [35, Proposition 6.1] that the sample measures satisfy (II).

When $\theta:\Omega\to\Omega$ is a subshift of finite type on a finite alphabet, with a Gibbs measure for a Hölder potential, it is known that assumption (III”) is satisfied with $\mathcal{B}$ being the space of Hölder continuous functions [26, 36].

For $\alpha>0$ , let the norm $\|\cdot\|_{\alpha}$ be defined by $\|\cdot\|_{\alpha}=|\cdot|_{\alpha}+|\cdot|_{\infty}$ where $|f|_{\alpha}=\sup\left\{\frac{V_{n}(f)}{\alpha^{n}}:n\geq 0\right\}$ .
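To make this definition concrete, here is a minimal Python sketch for a function that depends only on the first `depth` coordinates of a one-sided sequence. It assumes the usual meaning of $V_{n}(f)$, namely the largest oscillation of $f$ over points sharing their first $n$ symbols. The one-sided setting, the names `variation`, `holder_seminorm`, `holder_norm` and the example function are illustrative assumptions, not taken from the paper.

```python
from itertools import product


def variation(f, alphabet, depth, n):
    """V_n(f): largest |f(x) - f(y)| over words of length `depth`
    that agree on their first n symbols (assumed definition)."""
    if n >= depth:
        return 0.0  # words agreeing on all `depth` symbols coincide
    best = 0.0
    for prefix in product(alphabet, repeat=n):
        values = [f(prefix + tail)
                  for tail in product(alphabet, repeat=depth - n)]
        best = max(best, max(values) - min(values))
    return best


def holder_seminorm(f, alphabet, depth, alpha):
    """|f|_alpha = sup_n V_n(f) / alpha**n; only n < depth can contribute."""
    return max(variation(f, alphabet, depth, n) / alpha ** n
               for n in range(depth))


def holder_norm(f, alphabet, depth, alpha):
    """||f||_alpha = |f|_alpha + |f|_infty."""
    sup_norm = max(abs(f(w)) for w in product(alphabet, repeat=depth))
    return holder_seminorm(f, alphabet, depth, alpha) + sup_norm


def binary_value(word):
    """Hypothetical test function: the dyadic number 0.w_0 w_1 w_2 ..."""
    return sum(symbol * 2.0 ** -(i + 1) for i, symbol in enumerate(word))
```

For instance, `holder_norm(binary_value, (0, 1), 3, 0.5)` adds the seminorm of `binary_value` to its sup norm over binary words of length 3.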

It was also proved in [35, Lemma 6.2] that for any $\beta\in(0,1]$, there exist $\alpha\in(0,1]$ and $C_{\beta}>0$ such that for every cylinder $C$ in $\mathcal{F}_{0}^{n}(X)$, the map $\psi_{1}:\omega\mapsto\mu_{\omega}(C)$ is $\alpha$-Hölder and $\|\psi_{1}\|_{\alpha}\leq C_{\beta}r^{-\beta n}$. Thus, $\|\psi_{1}\|_{\alpha}\leq\xi^{n}$ for some $\xi\geq 0$. Moreover, since for all real-valued functions $f,g$ we have $|\max f(x)-\max g(x)|\leq\max|f(x)-g(x)|$, we obtain that the map $\psi_{2}:\omega\mapsto\max_{C_{n}}\mu_{\omega}(C_{n})$ is $\alpha$-Hölder and $\|\psi_{2}\|_{\alpha}\leq\xi^{n}$. Thus, (IV’) is satisfied.
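The elementary inequality on maxima can be checked in two lines. The following is the standard argument, written for functions $f,g$ on a common set on which the maxima are attained (an assumption made here for simplicity; it holds for a maximum over the finitely many $n$-cylinders).

```latex
\[
f(x)\le g(x)+|f(x)-g(x)|\le \max g+\max|f-g| \quad\text{for every } x,
\qquad\text{hence}\qquad \max f-\max g\le\max|f-g| .
\]
\[
\text{Exchanging } f \text{ and } g \text{ gives } \max g-\max f\le\max|f-g| ,
\quad\text{so}\quad |\max f-\max g|\le\max|f-g| .
\]
```

Taking $f(C_{n})=\mu_{\omega}(C_{n})$ and $g(C_{n})=\mu_{\omega^{\prime}}(C_{n})$ shows that any bound on the variation of $\psi_{1}$ that is uniform over cylinders passes to $\psi_{2}$.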

Assumption (I-a) has been proved in [35, Section 6.2]. However, our proof contains a mistake, since both terms on the right-hand side of the first equation on page 149 should involve the Hölder norm. In fact, we will prove that the sample measures satisfy (I). One can observe that to obtain Theorem 2.2 in [35], (I-a) was only used in equation (4.3) and could be substituted by (I).

Following the proof of [30, Proposition 2.4], we fix our set $A\in\mathcal{F}_{0}^{n}$ and take both $w$ and $v$ to be $w(\omega)=w_{A}(\omega)=\mu_{\omega}(A)-\mu(A)$ (this normalisation by $\mu(A)$ simplifies the calculations). Note that $|w|_{\infty}\leq 1$ . Let $\gamma\in(0,\alpha)$ . For $k\in\mathbb{N}$ , we approximate $v,w$ by $v_{k},w_{k}$ , depending only on coordinates $x_{-k},\ldots,x_{0},\ldots,x_{k}$ such that $|v-v_{k}|\leq\gamma^{k}|v|_{\gamma}$ and $|w-w_{k}|\leq\gamma^{k}|w|_{\gamma}$ . So that proof yields that for $n\geq 2k$ ,

[TABLE]

So taking $k=\lfloor\frac{n+g}{3}\rfloor$ , if we choose $\gamma\in(0,\alpha)$ so that $\gamma^{\frac{1}{3}}r^{-\beta}<1$ , we obtain

[TABLE]

Moreover, we can observe that by (II)

[TABLE]

Thus, by (3) and (4), (I) is verified.

Finally, we showed that if the fiber maps satisfy conditions 1.–4. and the base transformation is a subshift of finite type on a finite alphabet with a Gibbs measure for some Hölder potential, then assumptions (I), (II), (III”) and (IV’) are satisfied. Thus, if $0<\underline{H}_{2}(\mu)\leq\overline{H}_{2}(\mu)<2h_{0}$ , one can apply Theorem 4.

3 Proofs

In this section, we will prove our theorems. Both proofs follow the lines of [9] but diverge at some point since the sample measures are not invariant but satisfy (1).

Proof of Theorem 2.

For simplicity we assume $\alpha(g)=e^{-g}.$ Let $\varepsilon>0$ and define

[TABLE]

where $d$ is a constant to be chosen later.

Let us also denote

[TABLE]

and

[TABLE]

Let $\omega\in\Omega$ such that (1) is satisfied. Using Markov’s inequality we obtain

[TABLE]

Moreover, the invariance formula (1) of the sample measures gives us

[TABLE]

One can notice that, since the sample measures are not invariant, we cannot estimate the previous sum directly as in the deterministic case [9]. Thus, this is where our proof will differ and where we will use the mixing assumptions, which were not necessary in the deterministic proof. First of all, using Markov’s inequality, we observe that

[TABLE]

To study the behaviour of the integral on the right-hand side of the previous inequality, we will divide the sum into two parts: when $i$ and $j$ are far from one another and when they are not. Let us define $m=\gamma\log n$ where $\gamma>0$ will be chosen later.
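As a rough illustration of why this splitting helps, the following Python sketch counts how many index pairs $(i,j)$ fall on each side of the threshold $m=\gamma\log n$. It only illustrates the counting: the close pairs number at most $n(2m+1)$, a small fraction of the $n^{2}$ pairs, so they can afford a crude bound, while the far pairs are the ones to which mixing is applied. The function name `split_pairs` and the index range $0\leq i,j<n$ are assumptions made for the example.

```python
import math


def split_pairs(n, gamma):
    """Count pairs (i, j) with 0 <= i, j < n that are close (|i - j| <= m)
    or far (|i - j| > m), where m = gamma * log(n)."""
    m = gamma * math.log(n)
    close = sum(1 for i in range(n) for j in range(n) if abs(i - j) <= m)
    far = n * n - close
    return m, close, far


if __name__ == "__main__":
    for n in (100, 200, 400):
        m, close, far = split_pairs(n, gamma=2.0)
        # The close pairs always fit under the elementary bound n * (2m + 1).
        print(n, round(m, 2), close, far, close <= n * (2 * m + 1))
```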

When $i$ and $j$ are close to one another, we have, using that $\mu_{\theta^{j}\omega}$ is a probability measure and the invariance of $\mathbb{P}$,

[TABLE]

When $i$ and $j$ are far from one another, we can use the mixing assumptions (I) and (II) to obtain

[TABLE]

Thus, we obtain, for $n$ large enough,

[TABLE]

where the last inequality comes from the definition of $h_{0}$ and $\underline{H}_{2}(\mu)$.

Then, choosing $d>0$ large enough and $\gamma>0$ large enough, we have, by definition of $k_{n}$ and since $\underline{H}_{2}(\mu)\leq 2h_{0}$ , that

[TABLE]

Choosing a subsequence $\{n_{\kappa}\}_{\kappa\in\mathbb{N}}$ such that $n_{\kappa}=\lceil e^{\kappa^{2}}\rceil$ we have that

[TABLE]

Since the last quantity is summable in $\kappa$ , the Borel-Cantelli lemma gives that for $\mathbb{P}$ -almost every $\omega\in\Omega$ , if $\kappa$ is large enough then

[TABLE]

Thus, this inequality together with (5) gives us that for $\mathbb{P}$ -almost every $\omega\in\Omega$ , if $\kappa$ is large enough then

[TABLE]

As previously, since the last quantity is summable in $\kappa$ , the Borel-Cantelli lemma gives that for $\mu_{\omega}\otimes\mu_{\omega}$ -almost every $(x,y)$ , if $\kappa$ is large enough then

[TABLE]

and then

[TABLE]

Finally, taking the limit superior in the previous equation and observing that $(n_{\kappa})_{\kappa}$ is increasing, $(M_{n})_{n}$ is increasing and $\underset{\kappa\rightarrow+\infty}{\lim}\frac{\log n_{\kappa}}{\log n_{\kappa+1}}=1$ , we have for $\mu_{\omega}\otimes\mu_{\omega}$ -almost every $(x,y)$

[TABLE]
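The passage from the subsequence $(n_{\kappa})_{\kappa}$ to all $n$ uses only the monotonicity of $M_{n}$ and of $\log n$. A sketch of this standard interpolation step, with $n_{\kappa}=\lceil e^{\kappa^{2}}\rceil$ as above:

```latex
\[
\frac{M_{n}(x,y)}{\log n}
\le \frac{M_{n_{\kappa+1}}(x,y)}{\log n_{\kappa+1}}\cdot\frac{\log n_{\kappa+1}}{\log n_{\kappa}}
\qquad\text{for } n_{\kappa}\le n\le n_{\kappa+1},
\]
\[
\kappa^{2}\le\log n_{\kappa}\le\kappa^{2}+\log 2
\quad\Longrightarrow\quad
1\le\frac{\log n_{\kappa+1}}{\log n_{\kappa}}\le\frac{(\kappa+1)^{2}+\log 2}{\kappa^{2}}
\xrightarrow[\kappa\to+\infty]{}1 .
\]
```

Hence the limit superior along all $n$ is bounded by the limit superior along the subsequence.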

Then the theorem is proved since $\varepsilon$ can be chosen arbitrarily small.

∎

Proof of Theorem 3 and Theorem 4 .

For $\varepsilon>0$ , let us define

[TABLE]

where $d$ is a constant that we will choose later.

Let $\omega\in\Omega$ such that (1) is satisfied. As in the proof of Theorem 2, we have

[TABLE]

Following the lines of the proof of Theorem 7 in [9], we have, by Chebyshev’s inequality,

[TABLE]
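The Chebyshev step rests on a general second-moment bound, recalled here for a generic square-integrable random variable $S$ with $\mathbb{E}(S)>0$ (a standard fact, stated independently of the specific definition of $S_{n}$):

```latex
\[
\{S=0\}\subset\{|S-\mathbb{E}(S)|\ge\mathbb{E}(S)\}
\quad\Longrightarrow\quad
\mathbb{P}(S=0)\le\mathbb{P}\big(|S-\mathbb{E}(S)|\ge\mathbb{E}(S)\big)
\le\frac{\operatorname{var}(S)}{\mathbb{E}(S)^{2}} .
\]
```

So the probability of the event in question is small as soon as the variance is small compared with the squared expectation.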

Thus, we need to control the variance of $S_{n}$ . First of all, we observe that

[TABLE]

We will estimate the variance dividing the sum of $\operatorname{var}(S_{n})$ into $4$ terms. Let $g=\log(n^{\beta})$ where $\beta$ is a constant that we will choose later.

For $i^{\prime}-i>g+k_{n}$ , we use the invariance formula (1) and the mixing assumption (II) to obtain:

[TABLE]

If, moreover, $j^{\prime}-j>g+k_{n}$ , using again the mixing assumption (II), we have

[TABLE]

However, if $j^{\prime}-j\leq g+k_{n}$ , we obtain:

[TABLE]

By symmetry, the case where $i^{\prime}-i\leq g+k_{n}$ and $j^{\prime}-j>g+k_{n}$ is treated like the previous one.

Finally, when $|i-i^{\prime}|\leq g+k_{n}$ and $|j-j^{\prime}|\leq g+k_{n}$ , we have:

[TABLE]

Then, one can gather these estimates to obtain

[TABLE]

This is where the proof diverges completely from the deterministic case. Indeed, as in the proof of Theorem 2, we cannot treat the previous estimate directly (which was possible in the deterministic case) and extra care is needed. To deal with the term with the maximum, we use Markov’s inequality to obtain

[TABLE]

Since $\overline{H}_{2}(\mu)<2h_{0}$, one can choose $\varepsilon$ small enough such that $ne^{-k_{n}(h_{0}-\varepsilon)}\leq n^{-\varepsilon}$ for every $n$ large enough.
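To see why such a choice of $\varepsilon$ exists, one can expand the exponent using $k_{n}=\lfloor\frac{1}{\overline{H}_{2}(\mu)+\varepsilon}(2\log n+d\log\log n)\rfloor$. The following is a sketch of the computation for a decaying factor $e^{-k_{n}(h_{0}-\varepsilon)}$, assuming $\overline{H}_{2}(\mu)>0$ and $\varepsilon<h_{0}$:

```latex
\[
k_{n}\ge\frac{2\log n+d\log\log n}{\overline{H}_{2}(\mu)+\varepsilon}-1
\quad\Longrightarrow\quad
n\,e^{-k_{n}(h_{0}-\varepsilon)}
\le e^{h_{0}-\varepsilon}\;
n^{\,1-\frac{2(h_{0}-\varepsilon)}{\overline{H}_{2}(\mu)+\varepsilon}}\;
(\log n)^{-\frac{d\,(h_{0}-\varepsilon)}{\overline{H}_{2}(\mu)+\varepsilon}} .
\]
\[
\lim_{\varepsilon\to 0}\Big(1-\frac{2(h_{0}-\varepsilon)}{\overline{H}_{2}(\mu)+\varepsilon}\Big)
=1-\frac{2h_{0}}{\overline{H}_{2}(\mu)}<0 .
\]
```

For $\varepsilon$ small the power of $n$ is therefore strictly below $-\varepsilon$, and the constant and logarithmic factors are absorbed for $n$ large enough.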

To deal with the expectation in the denominator in (6), we will need the following lemma (whose proof can be found after the proof of the theorem).

Lemma 5.

Let $\frac{3}{4}<\delta<1$ . Under the assumptions of Theorem 3 or Theorem 4, we have

[TABLE]

Thus, using this lemma with (7), we have

[TABLE]

Choosing a subsequence $\{n_{\kappa}\}_{\kappa\in\mathbb{N}}$ such that $n_{\kappa}=\lceil e^{\kappa^{2}}\rceil$ , the Borel-Cantelli lemma gives that for $\mathbb{P}$ -almost every $\omega\in\Omega$ , if $\kappa$ is large enough then

[TABLE]

and

[TABLE]

Thus, if $\kappa$ is large enough

[TABLE]

Thus, (6) together with (8) and (9) gives us that for $\mathbb{P}$ -almost every $\omega\in\Omega$ , if $\kappa$ is large enough then

[TABLE]

where the last inequality comes from the definition of $\overline{H}_{2}(\mu)$ and our choice of $k_{n}$. Finally, choosing $\beta$ large enough in the definition of $g$ and choosing $d<0$ small enough, we obtain that if $\kappa$ is large enough

[TABLE]

Since the last quantity is summable in $\kappa$ , the Borel-Cantelli lemma gives that for $\mu_{\omega}\otimes\mu_{\omega}$ -almost every $(x,y)$ , if $\kappa$ is large enough then

[TABLE]

and then

[TABLE]

Finally, using the same arguments as in the proof of Theorem 2, we have for $\mathbb{P}$ -almost every $\omega$

[TABLE]

for $\mu_{\omega}\otimes\mu_{\omega}$ -almost every $(x,y)$ .

Then the theorems are proved since $\varepsilon$ can be chosen arbitrarily small. ∎

Proof of Lemma 5.

As in the previous proof, we take $k_{n}=\lfloor\frac{1}{\overline{H}_{2}(\mu)+\varepsilon}(2\log n+d\log\log n)\rfloor$ and $g=\log(n^{\beta})$ where $d<0$ and $\beta>0$ are constants to be chosen later.

First of all, we use Markov’s inequality

[TABLE]

Firstly, we treat the last term in the previous numerator, using the mixing assumptions (I) and (II)

[TABLE]

To get an estimate on (10), we need to study the term $\int\mathbb{E}_{\omega}(S_{n})^{2}d\mathbb{P}$ . One can observe that

[TABLE]

We will separate the study of this integral depending on the relative distance and position between $i,j,i^{\prime}$ and $j^{\prime}$ and consider 5 different cases.

Case 1: $i,j,i^{\prime}$ and $j^{\prime}$ are all far from one another, i.e. at a distance greater than $g+k_{n}$. We will assume that $i<j<i^{\prime}<j^{\prime}$ (when the relative position is different, everything can be done identically because of the symmetry) and that $j-i>g+k_{n}$, $i^{\prime}-j>g+k_{n}$, $j^{\prime}-i^{\prime}>g+k_{n}$. Using the mixing assumptions (I-a) and (II) (a similar estimate is obtained when (III”) is satisfied) we obtain

[TABLE]

Case 2: only two indices are close. We will assume that $i\leq j\leq i^{\prime}\leq j^{\prime}$ and that $j-i>g+k_{n}$, $i^{\prime}-j>g+k_{n}$, $j^{\prime}-i^{\prime}\leq g+k_{n}$. Since the cylinders form a partition and the sample measures are probability measures, we have

[TABLE]

When the indices are in a different position and/or the two close indices are not $j^{\prime}$ and $i^{\prime}$, the same idea can be used. However, one needs to choose carefully with which index to take the maximum, so that one index disappears with one sum and we obtain a term similar to (13) where the 3 remaining indices are far from each other. Then, we use the mixing assumptions (III’) and (IV’) (a similar estimate is obtained when (III) and (IV) are satisfied) to get

[TABLE]

Case 3: three indices are close and one is far from them. We will assume that $i\leq j\leq i^{\prime}\leq j^{\prime}$ and that $j-i\leq g+k_{n}$, $i^{\prime}-j\leq g+k_{n}$, $j^{\prime}-i^{\prime}>g+k_{n}$. Since $\mu_{\theta^{j}\omega}\left(C_{k_{n}}\right)\leq 1$ and $\mu_{\theta^{i}\omega}$ is a probability measure, we have

[TABLE]

When the indices are in a different position, one can use the same idea so that we are left with two indices which are far from each other and measure the same cylinder. Thus we can use the mixing assumptions (I) and (II) to obtain

[TABLE]

Case 4: two indices are close and both are far from the two other indices, which are close to one another. We will assume that $i\leq j\leq i^{\prime}\leq j^{\prime}$ and that $j-i\leq g+k_{n}$, $i^{\prime}-j>g+k_{n}$, $j^{\prime}-i^{\prime}\leq g+k_{n}$. Since the sample measures are probability measures, we obtain

[TABLE]

For the other relative positions, we can observe that

•

if the measures with the two indices that are far from each other measure different cylinders, we obtain an estimate similar to (16);

•

if the measures with the two indices that are far from each other measure the same cylinder, the case can be treated as Case 3.

Then, using the mixing assumptions (III’) and (IV’) (a similar estimate is obtained when (III) and (IV) are satisfied), we have

[TABLE]

Case 5: all the indices are close. We will assume that $i\leq j\leq i^{\prime}\leq j^{\prime}$ and that $j-i\leq g+k_{n}$, $i^{\prime}-j\leq g+k_{n}$, $j^{\prime}-i^{\prime}\leq g+k_{n}$. In this case, the relative position is irrelevant. Since the sample measures are probability measures, we obtain

[TABLE]
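The five cases above can be summarised by looking at the three consecutive gaps between the sorted indices and recording which of them exceed the threshold $g+k_{n}$. The Python sketch below is only a bookkeeping aid under that reading; the function name `classify` is hypothetical, and the sketch ignores which of $i,j,i^{\prime},j^{\prime}$ occupies which position, although the proof does use that information when choosing the estimate.

```python
def classify(indices, threshold):
    """Return the case number (1 to 5) of four indices, according to which
    consecutive gaps of the sorted indices exceed `threshold` (= g + k_n)."""
    a, b, c, d = sorted(indices)
    far = (b - a > threshold, c - b > threshold, d - c > threshold)
    n_far = sum(far)
    if n_far == 3:
        return 1  # all four indices far from one another
    if n_far == 2:
        return 2  # exactly one pair of close indices
    if n_far == 0:
        return 5  # all indices close
    # Exactly one far gap: in the middle it separates two close pairs
    # (Case 4); at an end it isolates a single index (Case 3).
    return 4 if far[1] else 3


if __name__ == "__main__":
    for quadruple in [(0, 100, 200, 300), (0, 100, 200, 205),
                      (0, 5, 10, 200), (0, 5, 200, 205), (0, 5, 10, 15)]:
        print(quadruple, classify(quadruple, threshold=10))
```

The eight possible far/close patterns of the three gaps are all covered, which is consistent with the five cases being exhaustive.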

Finally, (10) together with (11), (12), (14), (15), (17) and (18) gives us that there exists a constant $c_{1}$ such that

[TABLE]

where

[TABLE]

We recall that $k_{n}=\lfloor\frac{1}{\overline{H}_{2}(\mu)+\varepsilon}(2\log n+d\log\log n)\rfloor$ and that $g=\log(n^{\beta})$. Thus, as in (7), we have

[TABLE]

for every $n$ large enough.

Moreover, by definition of $\overline{H}_{2}(\mu)$ and our choice of $k_{n}$ , we have for every $n$ large enough

[TABLE]

First of all, we choose $\beta\gg 1$ so that for any $n$ large enough

[TABLE]

Then, for the first term in (19), we have

[TABLE]

For the second term in (19), we have

[TABLE]

For the third term in (19), we have

[TABLE]

For the fourth term in (19), we have

[TABLE]

And, for the fifth term in (19), we have

[TABLE]

Finally, putting all these estimates together in (19), choosing $d\ll-1$ and since $3/4<\delta<1$ , we obtain

[TABLE]

∎

Acknowledgements: The author would like to thank Rodrigo Lambert for various comments on a first draft of the paper, Mike Todd for fruitful discussions and for fixing the mistake found in [35] and the referee for useful suggestions to improve the paper.

Bibliography

[1] M. Abadi and L. Cardeno, Renyi entropies and large deviations for the first-match function, IEEE Trans. Inf. Theory 61 (2015), no. 4, 1629–1639.
[2] M. Abadi and R. Lambert, From the divergence between two measures to the shortest path between two observables, Ergod. Theory Dyn., 39 (2019), no. 7, 1729–1744.
[3] M. Abadi and N. Vergne, Poisson approximation for search of rare words in DNA sequences, ALEA Lat. Am. J. Probab. Math. Stat. 4 (2008), 223–244.
[4] R. Arratia and M. Waterman, An Erdös-Rényi Law with Shifts, Adv. Math. 55 (1985), 13–23.
[5] R. Arratia and M. Waterman, Critical phenomena in sequence matching, Ann. Probab., 13 (1985), no. 4, 1236–1249.
[6] R. Arratia and M. Waterman, The Erdös-Rényi strong law for pattern matching with a given proportion of mismatches, Ann. Probab., 17 (1989), no. 3, 1152–1169.
[7] R. Arratia and M. Waterman, A phase transition for the score in matching random sequences allowing deletions, Ann. Appl. Probab., 4 (1994), no. 1, 200–225.
[8] R. Arratia, L. Gordon and M. Waterman, An extreme value theory for sequence matching, Ann. Statist., 14 (1986), no. 3, 971–993.
