Longest common substring for random subshifts of finite type
Jerome Rousseau

TL;DR
This paper investigates the behavior of the longest common substring in random subshifts of finite type and random sequences, linking it to Rényi entropy under exponential mixing conditions, with a focus on quenched results.
Contribution
It establishes a connection between the longest common substring behavior and Rényi entropy in random subshifts, providing quenched results under mixing assumptions.
Findings
Behavior linked to Rényi entropy
Results hold under exponential mixing
Focus on quenched analysis
Abstract
In this paper, we study the behaviour of the longest common substring for random subshifts of finite type (for dynamicists) or of the longest common substring for random sequences in random environments (for probabilists). We prove that, under some exponential mixing assumptions, this behaviour is linked to the Rényi entropy of the stationary measure. We emphasize that what we establish is a quenched result.
Jérôme Rousseau
URL: www.sd.mat.ufba.br/~jerome.rousseau
Universidade do Porto and Universidade Federal da Bahia
Departamento de Matemática,
Faculdade de Ciências da Universidade do Porto,
Rua do Campo Alegre, 687,
4169-007 Porto, Portugal
Departamento de Matemática,
Universidade Federal da Bahia,
Av. Ademar de Barros s/n,
40170-110 Salvador, Brazil
Summary
In this article, we study the behaviour of the longest common substring for random subshifts of finite type (for dynamicists) or of the longest common substring for random sequences in random environments (for probabilists). We prove that, under exponential mixing assumptions, this behaviour is linked to the Rényi entropy of the stationary measure. We emphasize that what we establish is a fibered result.
Keywords: Longest common substring, Rényi entropy, random dynamical systems, random sequences in random environments, string matching.
MSC classes: 60F15, 60K37, 37A50, 37A25, 37Hxx, 94A17, 92D20.
This work was partially supported by CNPq, by FCT project PTDC/MAT-PUR/28177/2017, with national funds, and by CMUP (UIDB/00144/2020), which is funded by FCT with national (MCTES) and European structural funds through the programs FEDER, under the partnership agreement PT2020.
Introduction
To measure the similarity between sequences, one has to develop computational tools to compare the sequences (and to optimize the algorithm) and probabilistic tools to assess the significance of the relationship. Thus sequence comparison (and in particular sequence alignment and sequence matching) has its roots in computer science and probability and has applications in areas as diverse as bioinformatics, geology, linguistics and the social sciences. We refer the reader to [31, 39] for a broad introduction to sequence comparison (with particular attention to biology).
One particularly relevant object in DNA comparison is the longest common substring, i.e. the longest string of DNA which appears in two (or more) strands. For example, for the following two strands
[TABLE]
[TABLE]
a longest common substring is ACAA (TGAC is also a longest common substring) and is of length 4 when the total length of the strands is 20. A way to distinguish if this behaviour is common or rare is to obtain probabilistic results which allow us to understand the statistical significance of our comparison.
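As a concrete illustration of the object discussed above, the following minimal sketch computes the longest common substrings of two strings by dynamic programming. The function name `longest_common_substrings` and the two strands are hypothetical choices made for this example (they are not the strands of the original example); they were picked so that ACAA and TGAC are again the longest common substrings, of length 4, for a total length of 20.

```python
def longest_common_substrings(a: str, b: str) -> tuple[int, set[str]]:
    """Return the length of the longest common substring of a and b,
    together with every substring attaining that length."""
    best_len = 0
    best: set[str] = set()
    # prev[j] = length of the longest common suffix of a[:i-1] and b[:j]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # extend the common suffix ending at a[i-2], b[j-2]
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best = {a[i - curr[j]:i]}
                elif curr[j] == best_len:
                    best.add(a[i - curr[j]:i])
        prev = curr
    return best_len, best


# Hypothetical strands (assumption: chosen only for illustration).
strand_1 = "ACAATGACGG"
strand_2 = "TTGACCACAA"
length, words = longest_common_substrings(strand_1, strand_2)
# length == 4 and words == {"ACAA", "TGAC"}
```

The quadratic-time table is enough for short strands; for long sequences one would typically switch to a suffix-based structure, which is outside the scope of this sketch.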
In this paper, we will concentrate on the behaviour of the length of the longest common substring when the length of the strings grows, more precisely, for two sequences $x$ and $y$, the behaviour, when $n$ goes to infinity, of
$$M_n(x,y)=\max\{m : x_{i+k}=y_{j+k}\ \text{for } k=1,\dots,m \text{ and for some } 0\le i,j\le n-m\}.$$
For sequences drawn randomly from the same alphabet, this problem was studied by Arratia and Waterman in [4]. More precisely, if each term of the sequences is drawn independently within some alphabet $\mathcal{A}$ with respect to some probability $\mathbb{P}$, then they proved that for $\mathbb{P}\otimes\mathbb{P}$-almost every $(x,y)$,
$$\lim_{n\to\infty}\frac{M_n(x,y)}{\log n}=\frac{2}{-\log p}$$
where $p=\mathbb{P}(X_1=Y_1)$.
They also proved the same result for independent irreducible and aperiodic Markov chains on a finite alphabet, and in this case $p$ is the largest eigenvalue of the matrix $[(P_{ij})^2]$ (where $[P_{ij}]$ is the transition matrix).
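The constant of the Markov case can be approximated numerically. The sketch below is only an illustration and is not taken from [4]: it runs a power iteration on the matrix obtained by squaring each entry of a transition matrix, and converts the resulting eigenvalue $p$ into the growth rate $2/(-\log p)$ of the Arratia and Waterman result. The function names and the two transition matrices are assumptions made for the example, and the transition matrix is assumed irreducible and aperiodic so that the iteration converges.

```python
import math


def largest_eigenvalue_of_squared_entries(P, iterations=200):
    """Approximate the largest eigenvalue of the matrix [(P_ij)^2] by power iteration.

    P is a transition matrix given as a list of rows (assumed irreducible
    and aperiodic, so that power iteration converges).
    """
    Q = [[entry * entry for entry in row] for row in P]
    size = len(Q)
    vector = [1.0] * size
    eigenvalue = 0.0
    for _ in range(iterations):
        image = [sum(Q[i][j] * vector[j] for j in range(size)) for i in range(size)]
        eigenvalue = max(image)  # normalise so that the largest entry is 1
        vector = [value / eigenvalue for value in image]
    return eigenvalue


def predicted_growth_rate(p):
    """Growth rate 2 / (-log p) of the longest common substring."""
    return 2.0 / (-math.log(p))


# Hypothetical examples: an i.i.d. fair coin written as a Markov chain,
# and a two-state chain with unequal transition probabilities.
fair_coin = [[0.5, 0.5], [0.5, 0.5]]
two_state = [[0.9, 0.1], [0.2, 0.8]]
p_coin = largest_eigenvalue_of_squared_entries(fair_coin)
p_chain = largest_eigenvalue_of_squared_entries(two_state)
```

For the fair coin the eigenvalue reduces to the collision probability $\sum_a p_a^2 = 1/2$, which is consistent with the i.i.d. statement.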
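In fact, one can observe that in both cases, $-\log p$ corresponds to the Rényi entropy $H_2(\mu)$ of the measure $\mu$ governing the sequences, defined (provided that it exists) by
$$H_2(\mu)=\lim_{k\to\infty}\frac{\log\sum\mu(C_k)^2}{-k}$$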
where the sums are taken over all k-cylinders. Even if the existence of the Rényi entropy is not known in general, it was computed in some particular cases: Bernoulli shift, finite state Markov chains, Gibbs measure of a Hölder-continuous potential [20] and infinite state Markov chains [11]. The existence was also proved for -mixing measures [27], for weakly -mixing processes [20] and for -regular processes [1, 2].
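For a Bernoulli measure, the sum over $k$-cylinders in the definition of the Rényi entropy can be evaluated by brute force. The following sketch (the function name and the weights are hypothetical) enumerates all $k$-cylinders and compares $-\frac{1}{k}\log\sum\mu(C_k)^2$ with the closed form $-\log\sum_a p_a^2$, which in the Bernoulli case already holds for every finite $k$.

```python
import itertools
import math


def renyi_entropy_estimate(weights, k):
    """Return -(1/k) * log(sum over k-cylinders of mu(C_k)^2)
    for the Bernoulli measure with the given letter weights."""
    total = 0.0
    for word in itertools.product(range(len(weights)), repeat=k):
        # measure of the cylinder [word] under the product measure
        measure = math.prod(weights[letter] for letter in word)
        total += measure ** 2
    return -math.log(total) / k


# Hypothetical letter weights on a three-letter alphabet.
weights = [0.5, 0.3, 0.2]
estimate = renyi_entropy_estimate(weights, 4)
closed_form = -math.log(sum(w * w for w in weights))
```

The enumeration grows exponentially in $k$, so it is only meant as a sanity check on small alphabets and short cylinders.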
Generalizations of the work [4] to sequences of different lengths, different distributions, more than two sequences, extreme value theory for sequence matching and distributional results can be found in e.g. [5, 8, 6, 7, 21, 15, 29, 28]. In a similar direction, one can also see [12, 14] (and references therein) where the authors investigate the growth rate of the maximal overlap in a string (i.e. the growth rate of the length of the longest repeated substring). We also refer to [37, 3, 2, 25] for relatively close problems.
Recently, in [9], the results of Arratia and Waterman were generalized to $\alpha$-mixing systems with exponential decay (and $\psi$-mixing with polynomial decay) and it was proved that if the Rényi entropy exists then for $\mu\otimes\mu$-almost every $(x,y)$,
$$\lim_{n\to\infty}\frac{M_n(x,y)}{\log n}=\frac{2}{H_2(\mu)}.$$
Furthermore, it was also shown in [9] that a generalization of the longest common substring problem for dynamical systems is to study the behaviour of the shortest distance between two orbits, that is, for a dynamical system $(X,T,\mu)$, the behaviour, when $n$ goes to infinity, of
$$m_n(x,y)=\min_{i,j=0,\dots,n-1} d(T^ix,T^jy).$$
Moreover, a relation between $m_n$ and the correlation dimension of the invariant measure was proved.
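The shortest distance between two orbits can be computed directly from its definition as a minimum over pairs of iterates. The sketch below does this for an arbitrary map and metric; the function names, the toy rotation of the circle and the starting points are assumptions chosen only to make the example self-contained and exactly checkable, and they say nothing about the asymptotic behaviour studied in [9].

```python
def shortest_distance(T, dist, x, y, n):
    """Minimum of dist(T^i x, T^j y) over 0 <= i, j <= n - 1 (n >= 1)."""
    orbit_x, orbit_y = [], []
    for _ in range(n):
        orbit_x.append(x)
        orbit_y.append(y)
        x, y = T(x), T(y)
    return min(dist(u, v) for u in orbit_x for v in orbit_y)


def circle_distance(u, v):
    """Distance on the circle R/Z."""
    d = abs(u - v) % 1.0
    return min(d, 1.0 - d)


def rotation(x):
    """Toy map (assumption): rotation of the circle by a quarter turn."""
    return (x + 0.25) % 1.0


m_4 = shortest_distance(rotation, circle_distance, 0.0, 0.1, 4)
```

The double loop costs $n^2$ distance evaluations, which is acceptable for exploratory experiments with moderate $n$.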
It is natural to try and obtain the same type of results for random dynamical systems since they could model more precisely physical phenomena. For random sequences, this could correspond for example to a modification (e.g. a small perturbation) on the probability with which the letters of the alphabet are drawn (i.e. random sequences in random environments). For dynamical systems, this could correspond to adding some random noise or small perturbations while iterating the same transformation, or iterating different transformations drawn randomly within a family of transformations (see e.g. [24] for an introduction to random dynamical systems).
In [13], the behaviour of the longest common substring of encoded sequences (and of the shortest distance between observed orbits) was studied and a relation with the Rényi entropy of the pushforward measure was proved. In particular, this allows the authors to obtain annealed results on the shortest distance between orbits of random dynamical systems.
Obtaining quenched results is much more delicate, in particular because the random maps generally do not have a common invariant measure. The first family of random dynamical systems to study, and for which one can hope to obtain results, is that of random subshifts of finite type. Indeed, good mixing properties have been proved (see e.g. [22, 10, 23, 38]), which allow one to obtain other statistical properties (e.g. [34, 35, 19] for the distribution of hitting times, [18] for extreme value laws). Following this idea and the setting of these papers, we study here the behaviour of the longest common substrings for random subshifts of finite type (in probabilistic language, this corresponds to the longest common substring for random sequences in random environments) and prove a link with the Rényi entropy of the stationary measure.
The paper is organized as follows. In Section 1, we will define random subshifts of finite type, explain our assumptions and give an upper bound (Theorem 2) and a lower bound (Theorem 3 and Theorem 4) for the growth rate of the longest common substring for random subshifts. In Section 2, we will apply our results to random Bernoulli shifts and random Gibbs measures. The proof of the theorems will be given in Section 3.
1 Statement of the main results
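We first give the definition of a random subshift of finite type. Let $(\Omega,\theta,\mathbb{P})$ be an invertible ergodic measure preserving system, set $X=\mathcal{A}^{\mathbb{N}}$ for some alphabet $\mathcal{A}\subset\mathbb{N}$ and let $\sigma:X\to X$ denote the shift. Let $b:\Omega\to\mathbb{N}$ be a random variable. Let $A=\{A(\omega)=(a_{ij}(\omega)):\omega\in\Omega\}$ be a random transition matrix, i.e. for any $\omega\in\Omega$, $A(\omega)$ is a $b(\omega)\times b(\theta\omega)$-matrix with entries in $\{0,1\}$, at least one non-zero entry in each row and each column and such that $\omega\mapsto a_{ij}(\omega)$ is measurable for any $i\in\mathbb{N}$ and $j\in\mathbb{N}$. For any $\omega\in\Omega$ define the subset of the integers $X_\omega=\{1,\dots,b(\omega)\}$ and
$$E_\omega=\{x=(x_0,x_1,\dots) : x_i\in X_{\theta^i\omega}\ \text{and}\ a_{x_ix_{i+1}}(\theta^i\omega)=1\ \text{for all } i\ge 0\}\subset X,$$
$$\mathcal{E}=\{(\omega,x) : \omega\in\Omega,\ x\in E_\omega\}\subset\Omega\times X.$$
We consider the random dynamical system coded by the skew-product $S:\mathcal{E}\to\mathcal{E}$ given by $S(\omega,x)=(\theta\omega,\sigma x)$. Let $\nu$ be an $S$-invariant probability measure with marginal $\mathbb{P}$ on $\Omega$ and let $(\mu_\omega)_\omega$ denote its decomposition on $E_\omega$, that is, $d\nu(\omega,x)=d\mu_\omega(x)\,d\mathbb{P}(\omega)$. The measures $\mu_\omega$ are called the sample measures. Note $\mu_\omega(A)=0$ if $A\cap E_\omega=\emptyset$. We denote by $\mu=\int\mu_\omega\,d\mathbb{P}$ the marginal of $\nu$ on $X$.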
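To make the role of the random transition matrix concrete, here is a minimal sketch of an admissibility test for a finite word in a fibre of a random subshift. It assumes the usual rule that the transition from the $i$-th symbol to the next one must be allowed by the transition matrix of the $i$-th shifted environment; the function name `is_admissible`, the 1-based symbols and the two example matrices are hypothetical and only serve the illustration.

```python
def is_admissible(word, matrices):
    """Check whether a finite word x_0 ... x_{m-1} (symbols are integers >= 1)
    is admissible, assuming matrices[i] is the 0/1 transition matrix of the
    i-th shifted environment, given as a list of rows.

    Only transitions are checked, so words of length < 2 are always accepted.
    """
    if len(matrices) < len(word) - 1:
        raise ValueError("one transition matrix is needed per transition")
    for i in range(len(word) - 1):
        A = matrices[i]
        row, col = word[i] - 1, word[i + 1] - 1
        # the symbols must index an existing row and column of A
        if not (0 <= row < len(A) and 0 <= col < len(A[0])):
            return False
        if A[row][col] != 1:
            return False
    return True


# Hypothetical environment: a 2 x 2 matrix followed by a 2 x 3 matrix,
# so the admissible alphabet changes along the fibre.
matrices = [
    [[1, 0], [1, 1]],
    [[0, 1, 1], [1, 0, 0]],
]
```

With these matrices the word `[1, 1, 2]` is accepted while `[1, 2, 1]` is rejected, since the first matrix forbids the transition from 1 to 2.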
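We emphasize that the sample measures are not invariant. However, since $\theta$ is invertible, by $S$-invariance of $\nu$ and almost everywhere uniqueness of the decomposition $d\nu=d\mu_\omega\,d\mathbb{P}$, we get for $\mathbb{P}$-almost every $\omega$,
$$\sigma_*\mu_\omega=\mu_{\theta\omega}.\qquad(1)$$
For $x\in X$ we denote by $C_n(x)$ the $n$-cylinder that contains $x$. Set $\mathcal{F}_0^n$ as the sigma-algebra in $X$ generated by all the $n$-cylinders.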
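As explained in the introduction, for two sequences $x,y\in X$, we are interested in the asymptotic behaviour of the longest common substring, that is the behaviour of
$$M_n(x,y)=\max\{m : x_{i+k}=y_{j+k}\ \text{for } k=1,\dots,m \text{ and for some } 0\le i,j\le n-m\}.$$
We will show it is linked to the Rényi entropy of the stationary measure $\mu$. Thus, we define the lower and upper Rényi entropies of the measure $\mu$:
$$\underline{H}_2(\mu)=\liminf_{k\to\infty}\frac{\log\sum\mu(C_k)^2}{-k}\qquad\text{and}\qquad\overline{H}_2(\mu)=\limsup_{k\to\infty}\frac{\log\sum\mu(C_k)^2}{-k}$$
where the sums are taken over all k-cylinders. When the limit exists we denote by $H_2(\mu)$ the common value.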
To obtain our results, we will need information on the decay of the measure of cylinders, thus we define
[TABLE]
where the max is taken over all k-cylinders.
We will assume the following: there is a constant and a function satisfying such that for all , and :
- (I)
the marginal measure satisfies
[TABLE]
- (II)
(fibered exponential -mixing) for -almost every
[TABLE]
One can observe that assumption (I) is weaker than -mixing since in the intersection we only deal with the same cylinder . We recall that the measure is -mixing if:
- (I-a)
(exponential -mixing) the marginal measure satisfies
[TABLE]
for all , and .
Before stating our results, we will consider the annealed case:
Theorem 1 (Theorem 4.4 [13]).
If , then
[TABLE]
Moreover, if hypothesis (I-a) holds, then
[TABLE]
First of all, we observe that the statement of this theorem is slightly different from that of Theorem 4.4 in [13], since they consider more general dynamical systems and not only random subshifts of finite type. Nevertheless, one can easily adapt their results and proof to obtain the theorem as stated here.
One could wonder why the Rényi entropy of $\mu$ appears in these results (and not the Rényi entropy of $\nu$ for example). In fact, when studying $M_n$, we are not interested in the behaviour of the whole orbits but only in their projection on $X$ (called an observation of the dynamical system). More precisely, if $\pi$ denotes the canonical projection (i.e. $\pi(\omega,x)=x$), we study the behaviour of the image (or observation) of the orbits, that is $\pi(S^i(\omega,x))$. The idea of observing dynamical systems was developed in [33, 32] to obtain annealed results for return times in random dynamical systems and for the shortest distance between random orbits in [13]. Moreover, it was proved that, when observing dynamical systems, these quantities are linked with the dimension (or in our case the Rényi entropy) of the pushforward measure $\pi_*\nu$ (where $\pi_*\nu(\cdot)=\nu(\pi^{-1}(\cdot))$). Furthermore, in our random setting the pushforward measure $\pi_*\nu$ and the measure $\mu$ are equal (e.g. [32, Proof of Theorem 8]) and thus $H_2(\pi_*\nu)=H_2(\mu)$.
Unfortunately, these techniques only give annealed results; thus, in this paper, we will use different tools to obtain quenched results.
Remark 1.
We note that since is a dynamical system, one could apply (under the right assumptions) the results of [9] to study the shortest distance between orbits and link it to the correlation dimension of . Nevertheless, it will not give us precise information on since takes into account the distance between elements of the orbits of and while only considers elements of the orbits of and .
We present now the first main result of this section which gives an upper bound for the growth rate of the longest common substring.
Theorem 2.
If and if hypothesis (I) and (II) hold, then for -almost every ,
[TABLE]
One can notice that in the deterministic case [9] and in the annealed case, no mixing assumptions are needed to obtain the upper bound. As one can see in the proof of this theorem, the main problem, and the main difference from the deterministic case, is that the sample measures are not invariant; this is the main reason for using mixing to obtain the upper bound (and the lower bound).
Moreover, one can observe that assuming is not too restrictive an assumption. Indeed, in the deterministic case this hypothesis is always satisfied (see e.g. [20] in the proof of Theorem 1 (IV)). In the random setting, this assumption rules out, for example, sample measures with extreme behaviour (relative to the others).
To obtain a lower bound, we will need stronger assumptions: we will need -mixing for the measure and we will require some mixing properties for the base transformation .
First of all, we will treat the case when is a -mixing two-sided shift, i.e. for some alphabet , is the shift and:
- (III)
(exponential -mixing) For all and for all and
[TABLE]
with .
Moreover, we will need that the sample measure of a cylinder of size does not depend on all the terms of :
- (IV)
there exists a function with such that for -almost every and every cylinder , the function belongs to .
One can observe that it is quite simple to check whether assumption (IV) is satisfied; however, this assumption is restrictive and only enables us to work with some special families of sample measures. Nevertheless, if the system satisfies some stronger mixing assumption we will be able to work with more general families of sample measures. Thus, after the statement of the next theorem we will give an alternative pair of assumptions which also allows us to obtain a lower bound for the growth rate of the longest common substring.
Theorem 3.
If and if hypothesis (I-a), (II), (III) and (IV) hold, then, for -almost every ,
[TABLE]
Moreover, if the Rényi entropy exists, we get for -almost every ,
[TABLE]
In Section 2.1, we will apply this result to random Bernoulli shifts.
Remark 2 (Infinite alphabets).
One can observe in the proof of Theorem 3 that stronger mixing assumptions for the stationary measure and the sample measures allow us to work with infinite alphabets. More precisely, if in Theorem 3 one replaces assumptions (I-a) and (II) by
(I’) (exponential -mixing) the marginal measure satisfies
[TABLE]
and
(II’) (fibered exponential -mixing) for -almost every
[TABLE]
then the same conclusions are satisfied.
To deal with more general random subshifts (and in particular random Gibbs measures in Section 2.2) we will need a stronger mixing assumption on the base (satisfied for example for Anosov diffeomorphisms [26]):
- (III’)
(exponential -mixing) There exists a Banach space such that for all , for all and , we have
[TABLE]
with and is the norm in the Banach space .
We are now able to replace assumption (IV) by a less restrictive assumption:
- (IV’)
There exists such that for every and every cylinder , the functions and (where the max is taken over all n-cylinders) belong to the Banach space and
[TABLE]
Moreover, if the base satisfies exponential -mixing, it will allow us to weaken our mixing assumption for the marginal measure and use assumption (I):
- (III”)
(exponential -mixing) There exists a Banach space such that for all , for all , and , we have
[TABLE]
with and is the norm in the Banach space .
In Section 2.2, we will check these assumptions for random Gibbs measures and will choose the Banach space to be the space of Hölder continuous functions.
With these assumptions, we obtain the same results as in Theorem 3:
Theorem 4.
If and if
- hypotheses (I-a), (II), (III’) and (IV’) are satisfied,
or
- hypotheses (I), (II), (III”) and (IV’) are satisfied,
then the conclusions of Theorem 3 hold.
We will now apply our results to random Bernoulli shifts and random Gibbs measures (these examples follow [34, 35], where assumptions (I-a) and (II) were proved to obtain a quenched exponential distribution of hitting times).
2 Examples
2.1 Random Bernoulli shifts
Let and be a subshift of finite type on the symbolic space and let be a Gibbs measure from a Hölder potential.
Let and make the shift a random subshift by putting on it the random Bernoulli measures constructed as follows. Let be a stochastic matrix with entries in . Set . The random Bernoulli measure is defined by
[TABLE]
First of all, hypothesis (IV) is satisfied since only depends on .
Since are Bernoulli measures, one can observe that for all , and :
[TABLE]
for every and every . Thus, property (I-a) is satisfied.
Moreover, it was proved in [34] that assumption (II) is satisfied. Since the Gibbs measure is exponentially -mixing, it is exponentially -mixing and (III) is satisfied. Thus, if one can apply Theorem 2 and if besides that then one can apply Theorem 3.
For example, when the base is i.i.d., we can compute the Rényi entropy. Indeed
[TABLE]
Thus,
[TABLE]
and
[TABLE]
A similar computation gives us
[TABLE]
So, if , we have for -almost every ,
[TABLE]
for -almost every .
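As a numerical companion to this computation, the sketch below evaluates the Rényi entropy of the stationary measure for a toy random Bernoulli shift over an i.i.d. base. It relies on two assumptions that are not taken from the text: the environment letters are drawn independently with weights `q`, and the sample measure of a cylinder has the product form `prod_i W[omega_i][x_i]`. Under these assumptions the stationary measure is itself Bernoulli with weights $r_j=\sum_a q_a w_{aj}$, so that the Rényi entropy equals $-\log\sum_j r_j^2$; the code checks this against a brute-force sum over cylinders. All names and numerical values are hypothetical.

```python
import itertools
import math


def stationary_cylinder_measure(word, q, W):
    """Average of the sample measures of [word] over all environments,
    ASSUMING i.i.d. environment letters with weights q and the product
    form prod_i W[omega_i][x_i] for the sample measures."""
    total = 0.0
    for omega in itertools.product(range(len(q)), repeat=len(word)):
        weight = math.prod(q[a] for a in omega)
        total += weight * math.prod(W[omega[i]][x] for i, x in enumerate(word))
    return total


def renyi_entropy_from_cylinders(q, W, k):
    """-(1/k) log of the sum of squared stationary measures of k-cylinders."""
    letters = range(len(W[0]))
    total = sum(stationary_cylinder_measure(word, q, W) ** 2
                for word in itertools.product(letters, repeat=k))
    return -math.log(total) / k


def renyi_entropy_closed_form(q, W):
    """-log sum_j r_j^2 with r_j = sum_a q_a W[a][j]."""
    r = [sum(q[a] * W[a][j] for a in range(len(q))) for j in range(len(W[0]))]
    return -math.log(sum(x * x for x in r))


q = [0.5, 0.5]
W = [[0.5, 0.5], [0.2, 0.8]]
brute_force = renyi_entropy_from_cylinders(q, W, 3)
closed_form = renyi_entropy_closed_form(q, W)
```

Following the pattern of the results recalled in the introduction, the quantity `2 / closed_form` would then be the candidate growth rate of the longest common substring relative to $\log n$ in this toy model.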
In this case, whether the condition holds or not can be easily checked. For example, this condition will be satisfied if the letter with the maximum weight is always the same. Indeed, assuming that there exists such that for every , we observe that
[TABLE]
and thus
[TABLE]
Also, the condition will be satisfied if all the letters have relatively close probabilities, i.e. if there exists a constant such that for every and every . Indeed, in this case, we have
[TABLE]
and thus . This could be applied to small perturbations of a uniform Bernoulli shift, i.e., with for every and every (one can easily check that in this case with ).
2.2 Random Gibbs measures
In this section we will give details on a family of shifts which satisfy our assumptions.
We will use the approach detailed in [38] which is concerned with shifts on , for example the full shift. We note that this extends a little beyond the full shift, to the so-called BIP setting.
We assume that is an invertible measure preserving system and let and let denote the shift. For , let be the usual symbolic metric on , i.e., where for , but .
Assume that is a function which is almost surely Hölder continuous, which is to say, for
[TABLE]
there is some and such that where .
Define . If are in the same -cylinder for , then . As in the proof of [17, Lemma 7.2], the assumption on the integrability of implies that the above limit is finite a.s., say . However, it is also pointed out in [38] that if is integrable, then we have an a.s. uniform upper bound, say on . Given a Hölder function , then we define
[TABLE]
Now we define the random Ruelle operator by
[TABLE]
where where is such that is well-defined. As in [16, 38], it can be shown that there exists some constant and some measurable function which is uniformly bounded from below, such that a.s. and such that satisfies the same smoothness properties as , i.e. we have the same and in the variation. This allows us to replace with
[TABLE]
Letting denote the corresponding transfer operator, one consequence of this is that . Note also that random equilibrium states for and coincide.
Now we have the property that
[TABLE]
for appropriate observables .
We will make the following almost sure assumptions on our system (which are satisfied for subshifts of finite type with Hölder potentials):
1. , so is a.s. uniformly bounded, independently of .
2. There exists a measure where , i.e., (2) holds for observables.
3. Big images: there exists some such that for any -cylinder and , .
4. There exist , and as such that
[TABLE]
Under these conditions, it was proved in [35, Proposition 6.1] that the sample measures satisfy (II).
When is a subshift of finite type on a finite alphabet, with a Gibbs measure for a Hölder potential, it is known that assumption (III”) is satisfied with being the space of Hölder continuous functions [26, 36].
For , let the norm be defined by where .
It was also proved in [35, Lemma 6.2] that for any , there exist and , such that for every cylinder in , the map is -Hölder and . Thus, for some . Moreover, since for every real-valued functions we have , we obtain that the map is -Hölder and . Thus, (IV’) is satisfied.
Assumption (I-a) has been proved in [35, Section 6.2]. However, our proof contains a mistake since both terms in the right-hand side of the first equation on page 149 should involve the Hölder norm. In fact, we will prove that the sample measures satisfy (I). One can observe that to obtain Theorem 2.2 in [35], (I-a) was only used in equation (4.3) and could be replaced by (I).
Following the proof of [30, Proposition 2.4], we fix our set and take both and to be (this normalisation by simplifies the calculations). Note that . Let . For , we approximate by , depending only on coordinates such that and . So that proof yields that for ,
[TABLE]
So taking , if we choose so that , we obtain
[TABLE]
Moreover, we can observe that by (II)
[TABLE]
Thus, by (3) and (4), (I) is verified.
Finally, we have shown that if the fiber maps satisfy conditions 1.–4. and the base transformation is a subshift of finite type on a finite alphabet with a Gibbs measure for some Hölder potential, then assumptions (I), (II), (III”) and (IV’) are satisfied. Thus, if , one can apply Theorem 4.
3 Proofs
In this section, we will prove our theorems. Both proofs follow the lines of [9] but diverge at some point since the sample measures are not invariant but satisfy (1).
Proof of Theorem 2.
For simplicity we assume Let and define
[TABLE]
where is a constant to be chosen later.
Let us also denote
[TABLE]
and
[TABLE]
Let such that (1) is satisfied. Using Markov’s inequality we obtain
[TABLE]
Moreover, the invariance formula (1) of the sample measures gives us
[TABLE]
One can notice that, since the sample measures are not invariant, we cannot estimate the previous sum directly as in the deterministic case [9]. Thus, this is where our proof will differ and where we will use the mixing assumptions, which were not necessary in the deterministic proof. First of all, using Markov’s inequality, we observe that
[TABLE]
To study the behaviour of the integral on the right hand side of the previous inequality, we will divide the sum in two terms, when and are far from one another and when they are not. Let us define where will be chosen later.
When and are close to one another, we have, using that is a probability measure and the invariance of
[TABLE]
When and are far from one another, we can use the mixing assumptions (I) and (II) to obtain
[TABLE]
Thus, we obtain, for large enough,
[TABLE]
where the last inequality comes from the definition of and .
Then, choosing large enough and large enough, we have, by definition of and since , that
[TABLE]
Choosing a subsequence such that we have that
[TABLE]
Since the last quantity is summable in , the Borel-Cantelli lemma gives that for -almost every , if is large enough then
[TABLE]
Thus, this inequality together with (5) gives us that for -almost every , if is large enough then
[TABLE]
As previously, since the last quantity is summable in , the Borel-Cantelli lemma gives that for -almost every , if is large enough then
[TABLE]
and then
[TABLE]
Finally, taking the limit superior in the previous equation and observing that is increasing, is increasing and , we have for -almost every
[TABLE]
Then the theorem is proved since can be chosen arbitrarily small.
∎
Proof of Theorem 3 and Theorem 4.
For , let us define
[TABLE]
where is a constant that we will choose later.
Let such that (1) is satisfied. As in the proof of Theorem 2, we have
[TABLE]
Following the lines of the proof of Theorem 7 in [9], we have, by Chebyshev’s inequality,
[TABLE]
Thus, we need to control the variance of . First of all, we observe that
[TABLE]
We will estimate the variance by dividing the sum of into terms. Let where is a constant that we will choose later.
For , we use the invariance formula (1) and the mixing assumption (II) to obtain:
[TABLE]
If, moreover, , using again the mixing assumption (II), we have
[TABLE]
However, if , we obtain:
[TABLE]
By symmetry, the case where and will be treated as the previous one.
Finally, when and , we have:
[TABLE]
Then, one can gather these estimates to obtain
[TABLE]
This is where the proof diverges completely from the deterministic case. Indeed, as in the proof of Theorem 2, we cannot treat the previous estimate directly (which was possible in the deterministic case) and extra care is needed. To deal with the term with the maximum, we use Markov’s inequality to obtain
[TABLE]
Since , one can choose small enough such that for every large enough.
To deal with the expectation in the denominator in (6), we will need the following lemma (whose proof can be found after the proof of the theorem).
Lemma 5.
Let . Under the assumptions of Theorem 3 or Theorem 4, we have
[TABLE]
Thus, using this lemma with (7), we have
[TABLE]
Choosing a subsequence such that , the Borel-Cantelli lemma gives that for -almost every , if is large enough then
[TABLE]
and
[TABLE]
Thus, if is large enough
[TABLE]
Thus, (6) together with (8) and (9) gives us that for -almost every , if is large enough then
[TABLE]
where the last inequality comes from the definition of and our choice of . Finally, choosing large enough in the definition of and choosing small enough, we obtain that if is large enough
[TABLE]
Since the last quantity is summable in , the Borel-Cantelli lemma gives that for -almost every , if is large enough then
[TABLE]
and then
[TABLE]
Finally, using the same arguments as in the proof of Theorem 2, we have for -almost every
[TABLE]
for -almost every .
Then the theorems are proved since can be chosen arbitrarily small. ∎
Proof of Lemma 5.
As in the previous proof, we take and where and are constants to be chosen later.
First of all, we use Markov’s inequality
[TABLE]
Firstly, we will treat the last term on the previous numerator, using the mixing assumptions (I) and (II)
[TABLE]
To get an estimate on (10), we need to study the term . One can observe that
[TABLE]
We will separate the study of this integral depending on the relative distance and position between and and consider 5 different cases.
Case 1: and are all far from one another, i.e. at least at a distance greater than . We will assume that (when the relative position is different, everything can be done identically because of the symmetry) and that , , . Using the mixing assumptions (I-a) and (II) (a similar estimate is obtained when (III”) is satisfied) we obtain
[TABLE]
Case 2: only two indices are close. We will assume that and that , , . Since the cylinders form a partition and the sample measures are probability measures, we have
[TABLE]
When the indices are in a different position and/or the two close indices are not and , the same idea can be used. However, one needs to choose carefully the index over which to take the maximum, so that one index disappears with one sum and we obtain a term similar to (13) where the 3 remaining indices are far from each other. Then, we use the mixing assumptions (III’) and (IV’) (a similar estimate is obtained when (III) and (IV) are satisfied) to get
[TABLE]
Case 3: three indices are close and one is far from them. We will assume that and that , , . Since and is a probability measure we have
[TABLE]
When the indices are in a different position, one can use the same idea so that we are left with two indices which are far from each other and measure the same cylinder. Thus we can use the mixing assumptions (I) and (II) to obtain
[TABLE]
Case 4: two indices are close and both are far from the two other indices which are close to one another. We will assume that and that , , . Since the sample measures are probability measures, we obtain
[TABLE]
For the other relative positions, we can observe that
- if the measures with the two indices that are far from each other measure different cylinders, we obtain an estimate similar to (16);
- if the measures with the two indices that are far from each other measure the same cylinder, the case can be treated as Case 3.
Then, using the mixing assumptions (III’) and (IV’) (a similar estimate is obtained when (III) and (IV) are satisfied), we have
[TABLE]
Case 5: all the indices are close. We will assume that and that , , . In this case, the relative position is irrelevant. Since the sample measures are probability measures, we obtain
[TABLE]
Finally, (10) together with (11), (12), (14), (15), (17) and (18) gives us that there exists a constant such that
[TABLE]
where
[TABLE]
We recall that and that . Thus, as in (7), we have
[TABLE]
for every large enough.
Moreover, by definition of and our choice of , we have for every large enough
[TABLE]
First of all, we choose so that for any n large enough
[TABLE]
Then, for the first term in (19), we have
[TABLE]
For the second term in (19), we have
[TABLE]
For the third term in (19), we have
[TABLE]
For the fourth term in (19), we have
[TABLE]
And, for the fifth term in (19), we have
[TABLE]
Finally, putting all these estimates together in (19), choosing and since , we obtain
[TABLE]
∎
Acknowledgements: The author would like to thank Rodrigo Lambert for various comments on a first draft of the paper, Mike Todd for fruitful discussions and for fixing the mistake found in [35] and the referee for useful suggestions to improve the paper.
[1] M. Abadi and L. Cardeno, Rényi entropies and large deviations for the first-match function, IEEE Trans. Inf. Theory 61 (2015), no. 4, 1629–1639.
[2] M. Abadi and R. Lambert, From the divergence between two measures to the shortest path between two observables, Ergodic Theory Dynam. Systems 39 (2019), no. 7, 1729–1744.
[3] M. Abadi and N. Vergne, Poisson approximation for search of rare words in DNA sequences, ALEA Lat. Am. J. Probab. Math. Stat. 4 (2008), 223–244.
[4] R. Arratia and M. Waterman, An Erdös-Rényi law with shifts, Adv. Math. 55 (1985), 13–23.
[5] R. Arratia and M. Waterman, Critical phenomena in sequence matching, Ann. Probab. 13 (1985), no. 4, 1236–1249.
[6] R. Arratia and M. Waterman, The Erdös-Rényi strong law for pattern matching with a given proportion of mismatches, Ann. Probab. 17 (1989), no. 3, 1152–1169.
[7] R. Arratia and M. Waterman, A phase transition for the score in matching random sequences allowing deletions, Ann. Appl. Probab. 4 (1994), no. 1, 200–225.
[8] R. Arratia, L. Gordon and M. Waterman, An extreme value theory for sequence matching, Ann. Statist. 14 (1986), no. 3, 971–993.
