This paper investigates the asymptotic distribution of the length of the longest common and increasing subsequences in two independent random sequences with arbitrary distributions, revealing a limit expressed via Brownian motion functionals.
Contribution
It establishes the limiting law of the longest common increasing subsequence length for sequences with arbitrary distributions, extending previous results to a broader setting.
Findings
01
Limiting distribution expressed as a functional of Brownian motions
02
Convergence in distribution after proper centering and normalization
03
Applicable to sequences with arbitrary probability distributions
Abstract
Let (Xk)k≥1 and (Yk)k≥1 be two independent sequences of i.i.d. random variables, with values in a finite and totally ordered alphabet Am:={1,…,m}, and having respective probability mass function p1X,…,pmX and p1Y,…,pmY. Let LCIn be the length of the longest common and weakly increasing subsequences in (X1,...,Xn) and (Y1,...,Yn). Once properly centered and normalized, LCIn is shown to have a limiting distribution which is expressed as a functional of two independent multidimensional Brownian motions.
Taxonomy
Topics: Random Matrices and Applications · Bayesian Methods and Mixture Models · Stochastic processes and statistical mechanics
Full text
On the limiting law of the length of the longest common and increasing subsequences in random words with arbitrary distributions
Clément Deslandes (C.M.A.P., Ecole Polytechnique, Palaiseau, 91120, France & Georgia Institute of Technology, Atlanta, GA, 30332, USA; [email protected]) and Christian Houdré (School of Mathematics, Georgia Institute of Technology, Atlanta, GA, 30332, USA; [email protected]).
Research supported in part by the grant ♯524678 from the Simons Foundation.
Keywords: Random Words, Longest Common Subsequences, Longest Increasing Subsequences, Weak Convergence,
Optimal Alignment, Last Passage Percolation, Random Matrices.
MSC 2010: 05A05, 60C05, 60F05.
Abstract
Let (Xk)k≥1 and (Yk)k≥1
be two independent sequences of i.i.d. random variables, with values in a finite and totally
ordered alphabet Am:={1,…,m}, m≥2, having respective probability mass function
p1X,…,pmX and p1Y,…,pmY.
Let LCIn be the length of the longest common and weakly increasing subsequences
in X1,...,Xn and Y1,...,Yn. Once properly centered and normalized, LCIn is shown
to have a limiting distribution which is expressed as a functional of two independent multidimensional Brownian motions.
1 Introduction and preliminary results
1.1 Introduction
We analyze the asymptotic behavior of LCIn, the length of the longest common subsequences in random words with an additional weakly increasing requirement. Throughout, (Xk)k≥1 and
(Yk)k≥1 are two
independent sequences of i.i.d. random variables with values in the finite totally ordered
alphabet Am:={1,…,m}, m≥2, and respective pmf p1X,…,pmX, piX>0, i=1,…,m
and p1Y,…,pmY, piY>0, i=1,…,m.
Next, LCIn, the length of the longest common and weakly increasing subsequences of the two random words
X1⋯Xn and Y1⋯Yn, is the largest integer r∈{1,…,n} such that there exist 1≤i1<⋯<ir≤n and 1≤j1<⋯<jr≤n such that
•
∀s∈{1,…,r}, Xis=Yjs,
•
Xi1≤Xi2≤⋯≤Xir and Yj1≤Yj2≤⋯≤Yjr,
and if no integer satisfies these two conditions, we set LCIn=0.
A thorough discussion of the study of LCIn, with potential applications, and a more complete bibliography,
is present in [2], where the following is further proved (below, as usual, ∧ is short for minimum):
Theorem 1.1.
Let Xk and Yk (k=1,2,…) be uniformly distributed over {1,…,m}. Then,
[TABLE]
where BX and BY are two independent m-dimensional standard Brownian motions on [0,1].
The results of [2] extended (and corrected) the proof of
the case m=2 analyzed in [4] and also conjectured the following generalization:
Theorem 1.2.
Let Xk and Yk (k=1,2,…) have the same distribution, let pmax=maxi∈{1,…,m}piX and let k∗ be its multiplicity. Then
[TABLE]
where BX and BY are two independent k∗-dimensional standard Brownian motions on [0,1].
Clearly, in case k∗=m,
the two limiting distributions in (1.1) and (1.2) are the same
but they differ otherwise. Indeed, (1.1)
involves two independent m-dimensional Brownian motions while
(1.2) involves k∗-dimensional ones.
So, in particular, if k∗=1, then the right-hand side of (1.2) is just
the minimum of two independent centered normal random variables. In view of the results
obtained in the one-sequence case, e.g., see [5], [1], and the many references
therein, it is tantalizing to conjecture that both the right-hand side of (1.1)
and of (1.2) can be realized as maximal eigenvalues of some
Gaussian random matrix models.
Below, we aim to obtain the limiting distribution of LCIn, without
assuming that the Xk and Yk (k=1,2,…) have the same distribution; providing also an alternative proof of
Theorem 1.1 as well as a proof of the conjectured (1.2).
A brief description of the content of our notes is as follows: the rest of the current section is devoted to studying
the asymptotic mean of LCIn. This asymptotic mean result is already not so predictable and allows for the proper centering in the limiting theorem
whose proof is provided in the next section. The third and final section
is mainly devoted to studying extensions and complements,
such as results for sequences with blocks and infinite countable alphabets.
Acknowledgements: We sincerely thank an Associate Editor and a referee for their
detailed readings and numerous comments which
greatly helped to improve this manuscript.
1.2 Probability
For i∈{1,…,m} and j∈{1,…,n}, let ℓ∈N={0,1,2,…} be such that j+ℓ≤n+1, and let
[TABLE]
be simply the number of letters i between, and including, j and j+ℓ−1 in X1,...,Xn (resp. Y1,...,Yn), with the convention that the sum is zero in case ℓ=0.
From the very definition of LCIn, it is clear that
[TABLE]
Next, let Λ={λ∈(R+)m=[0,+∞)m:λ1+⋯+λm=1}. For λ∈Λ, let
[TABLE]
where ⌊.⌋ is the usual integer part, aka the floor, function. When λ runs through Λ, ℓn(λ)=(ℓn(λ)1,…,ℓn(λ)m) runs exactly
through {ℓ∈Nm:ℓ1+⋯+ℓm=n}, so
[TABLE]
For ease of notations, throughout the paper, for all x∈(Rm)2, we
write x=(xX,xY) so, for example, above, λX,λY∈Λ becomes λ∈Λ2.
The above identity provides a representation of LCIn as a maximum over the locations, λ∈Λ2, where to pick in each word X1,…,Xn and Y1,…,Yn, the letters 1,2,…,m in order to form a common sub-word. This is different from the approach in [2], where the maximum is over
the numbers of letters 1,2,…,m in a common sub-word. Of course the two representations are equivalent. However, the advantage of our approach is that λ takes its values in a deterministic set, as opposed to a random set.
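The equivalence between the definition of LCIn and its representation as a maximum over locations can be checked on short words. The following is a minimal Python sketch, not part of the original argument: the names `lci_dp` and `lci_cuts` are ours, words are assumed to be sequences over {1,…,m}, and the cut-point version is a brute-force enumeration that is only practical for very short words.

```python
from itertools import combinations_with_replacement, product


def lci_dp(x, y, m):
    """Longest common weakly increasing subsequence length (letters 1..m)."""
    n_x, n_y = len(x), len(y)
    # d[i][j][c]: best length using x[:i], y[:j] and only letters <= c.
    d = [[[0] * (m + 1) for _ in range(n_y + 1)] for _ in range(n_x + 1)]
    for i in range(1, n_x + 1):
        for j in range(1, n_y + 1):
            a = x[i - 1]
            for c in range(1, m + 1):
                best = max(d[i - 1][j][c], d[i][j - 1][c])
                if a == y[j - 1] and a <= c:
                    # Match the two last letters; earlier letters must be <= a.
                    best = max(best, d[i - 1][j - 1][a] + 1)
                d[i][j][c] = best
    return d[n_x][n_y][m]


def lci_cuts(x, y, m):
    """Same quantity as a maximum over cut points: letter i is picked in
    x[a[i-1]:a[i]] and in y[b[i-1]:b[i]]."""
    def cut_sets(n):
        # All 0 = a_0 <= a_1 <= ... <= a_m = n.
        for inner in combinations_with_replacement(range(n + 1), m - 1):
            yield (0,) + inner + (n,)

    best = 0
    for a in cut_sets(len(x)):
        for b in cut_sets(len(y)):
            total = sum(
                min(x[a[i - 1]:a[i]].count(i), y[b[i - 1]:b[i]].count(i))
                for i in range(1, m + 1)
            )
            best = max(best, total)
    return best


# Exhaustive comparison over all pairs of binary words of length 4.
words = list(product(range(1, 3), repeat=4))
agree = all(lci_dp(x, y, 2) == lci_cuts(x, y, 2) for x in words for y in words)
print(agree)
```

The dynamic program runs in O(m·n²) time, whereas the enumeration over cut points grows polynomially with a degree depending on m; the latter is included only because it mirrors the representation used in the text.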
In order to keep dealing with maxima it will be convenient to replace
Bin in (1.5) by its continuous alternative: for i∈{1,…,m} and t∈[0,1],
let
[TABLE]
Next define the continuous counterparts of Vn,X and Vn,Y just as in (1.6) and (1.7),
replacing Bin by its continuous alternative, and let
[TABLE]
Our analysis rests upon estimating the variations of Bin,X and of Bin,Y.
To do so, let η∈(0,1/6) and let Anη be the event:
[TABLE]
By Hoeffding’s inequality,
[TABLE]
and so if Anη occurs, then for all x,y in [0,1] and i∈{1,…,m},
[TABLE]
and in particular,
[TABLE]
and the same applies to Y instead of X.
1.3 Asymptotic mean: distinct cases
Let us investigate the limiting behavior of LCIn/n. From (1.8),
[TABLE]
Note that the discrete and continuous versions of Vin,X(λX) differ by at most 1/n (and similarly for Y). Thus, using (here and throughout the paper) the following elementary inequality, valid for any a,b,c,d∈R,
[TABLE]
we get
[TABLE]
Moreover, if Anη occurs, then for all λ∈Λ2,
[TABLE]
so, letting f:(Rm)2→R be given via
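\[ f(\lambda) := \sum_{i=1}^{m} \left(p_i^X \lambda_i^X\right) \wedge \left(p_i^Y \lambda_i^Y\right), \tag{1.12} \]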
we have:
[TABLE]
By the Borel-Cantelli lemma (recalling (1.9)), almost surely, eventually
Anη occurs so LCInc/n and LCIn/n both converge almost
surely to
\[ e_{\max} := \max_{\lambda \in \Lambda^2} f(\lambda). \tag{1.13} \]
From
[TABLE]
we also get by dominated convergence
\[ \lim_{n \to +\infty} \frac{\mathbb{E}\, LCI_n}{n} = e_{\max}. \]
One can think of emax as the length ratio of the longest common and
increasing subsequences in a continuous, non-probabilistic setup:
the letters have density masses p1X,p2X,…,pmX and
p1Y,p2Y,…,pmY.
Now, let
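\[ U := \left\{ u \in [0,+\infty)^m : \sum_{i=1}^{m} \frac{u_i}{p_i^X} \le 1 \ \text{and} \ \sum_{i=1}^{m} \frac{u_i}{p_i^Y} \le 1 \right\}, \]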
and let ϕ:Rm→R be given by
ϕ:u↦u1+⋯+um.
On U, there is a correspondence between f in (1.12),
and the above ϕ. Indeed, for λ∈Λ2, defining u by
ui=(piXλiX)∧(piYλiY), f(λ)=ϕ(u),
and for u∈U, there exists λ∈Λ2,
such that λiX≥ui/piX and λiY≥ui/piY
so that f(λ)≥ϕ(u).
Therefore, emax=maxu∈Uϕ(u).
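Since emax is the maximum of the linear function ϕ over a polytope, it can be computed exactly for any given pair of distributions. The sketch below is ours and rests on two assumptions: that U is cut out by u≥0 together with the two constraints ∑ui/piX≤1 and ∑ui/piY≤1 (as the correspondence above suggests), and the standard linear-programming fact that the maximum is then attained at a vertex with at most two non-zero coordinates. The function name `e_max` is hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
from itertools import combinations


def e_max(p_x, p_y):
    """Maximum of u_1 + ... + u_m over u >= 0 subject to
    sum(u_i / p_x[i]) <= 1 and sum(u_i / p_y[i]) <= 1."""
    p_x = [Fraction(p) for p in p_x]
    p_y = [Fraction(p) for p in p_y]
    m = len(p_x)
    # Vertices with a single non-zero coordinate.
    best = max(min(p_x[i], p_y[i]) for i in range(m))
    # Vertices with two non-zero coordinates: both constraints are tight,
    # which gives a 2x2 linear system solved here by Cramer's rule.
    for i, j in combinations(range(m), 2):
        det = 1 / (p_x[i] * p_y[j]) - 1 / (p_x[j] * p_y[i])
        if det == 0:
            continue  # parallel constraints: no isolated two-letter vertex
        u_i = (1 / p_y[j] - 1 / p_x[j]) / det
        u_j = (1 / p_x[i] - 1 / p_y[i]) / det
        if u_i >= 0 and u_j >= 0:
            best = max(best, u_i + u_j)
    return best


F = Fraction
print(e_max([F(3, 8), F(3, 8), F(1, 4)], [F(1, 2), F(3, 8), F(1, 8)]))
```

When the two distributions coincide, every pair is skipped and the function returns the largest letter probability, in agreement with the identical-distribution setting of Theorem 1.2.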
Also, let
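\[ K_{\Lambda^2} := \{ \lambda \in \Lambda^2 : f(\lambda) = e_{\max} \} \quad \text{and} \quad L_U := \{ u \in U : \phi(u) = e_{\max} \}. \]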
The above correspondence provides for each element of KΛ2 an
element of LU, and for each element of LU at least one element of
KΛ2 (if one of the two inequalities defining U is strict, then there is
more than one way to define the corresponding λ).
Next, let I be the set of integers i∈{1,…,m} such that there exists ui∈LU
with uii>0. One can think of I as the letters that can be used to
maximize ϕ, or, equivalently, to maximize f. Let
[TABLE]
so uI∈LU and for all i∈I, uiI>0.
Thanks to the above correspondence, we define (and will use throughout
the paper) a∈Λ2 such that aiX=aiY=0 for all i∉I and aiX≥uiI/piX, aiY≥uiI/piY,
for all i∈I (a is a correspondent of uI). Since f(a)≥ϕ(uI)=emax, a∈KΛ2. We shall see, and use, that when restricting the alphabet to I, asymptotically (when properly centered and normalized) the distribution of LCIn remains unchanged.
Two distinct cases need to be analyzed in order to study the limiting distribution of LCIn.
Case a)
There exists u∈LU such that u1/p1X+⋯+um/pmX=1 and
u1/p1Y+⋯+um/pmY<1.
For example, when pX=(3/8,3/8,1/4) and pY=(1/2,3/8,1/8).
Here the maximum is 3/8, and I={1,2}.
Heuristically, this case indicates that the length of the common words is limited by the word
X1⋯Xn and not by Y1⋯Yn.
Using the correspondence between LU and KΛ2, this case is equivalent to
the following statement: there exists λ∈KΛ2 such that for all i∈{1,…,m},piXλiX≤piYλiY with at least one strict inequality. In this case, one has:
Lemma 1.3.
Let pmaxX=maxi∈{1,…,m}piX. Then I={i∈{1,…,m}:piX=pmaxX} and emax=pmaxX. Moreover there exists i1∈I such that pi1Y>pmaxX.
Proof.
Let i,j∈{1,…,m} be such that piX<pjX, and assume, by contradiction, that i∈I.
Let u∈LU satisfy u1/p1X+⋯+um/pmX=1 and u1/p1Y+⋯+um/pmY<1, and let v=(ui+u)/2, so that v∈U, ϕ(v)=emax (since both ui and u are in LU), vi>0,
v1/p1X+⋯+vm/pmX≤1 and v1/p1Y+⋯+vm/pmY<1.
Let, for ε>0, v(ε) be the vector v except at the
coordinates i and j where v(ε)i:=vi−εpiX
and v(ε)j:=vj+εpjX.
It is clear that, when ε is small enough,
v(ε)∈U and ϕ(v(ε))=emax+ε(pjX−piX)>emax,
leading to a contradiction.
Hence I⊂{i∈{1,…,m}:piX=pmaxX}.
Reciprocally, let i∈{1,…,m} be such that piX=pmaxX and let j∈I.
If i=j we are done. Otherwise, one can slightly change u by adding ε to the ith coordinate and subtracting ε from the jth coordinate so that ϕ(u) remains unchanged, and u is still in U (for ε small enough), so I={i∈{1,…,m}:piX=pmaxX}.
Since u1/p1X+⋯+um/pmX=∑i∈Iui/pmaxX>∑i∈Iui/piY, there exists i1∈I such that pi1Y>pmaxX.
It is finally clear that emax=pmaxX, completing the proof.
∎
As a consequence of the above lemma, we prove next that
[TABLE]
(in particular, this set is non-empty which is all that is really needed in the rest of the proof).
To show this equality,
first note that {λX:λ∈KΛ2}⊂J since, indeed,
when λ∈KΛ2, for every i∈I, pmaxXλiX≤piYλiY and then
take the sum.
Conversely, if λX∈J, ∑i∈IpmaxXλiX/piY≤1,
so let λY be such that
for every i∈I, λiY≥pmaxXλiX/piY and ∑i∈IλiY=1, while for i∈Ic,
let λiY=0. Clearly, λ∈KΛ2.
Case b)
For all u∈LU, u1/p1X+⋯+um/pmX=u1/p1Y+⋯+um/pmY=1.
Heuristically, this second case indicates that in order to form the longest common words, it is
necessary to make full use of both words. Using the correspondence between LU and KΛ2, this case is equivalent to the following: for all λ∈KΛ2,
for all i∈{1,…,m},piXλiX=piYλiY. We can further distinguish two
subcases, namely, we are in Case b1) if each coordinate of PX:=(1/piX)i∈I∈RI is equal to each coordinate of
PY=(1/piY)i∈I∈RI, and in Case b2) otherwise.
For example, if pX=(1/3,1/3,2/9,1/9) and pY=(1/3,1/3,1/9,2/9), we are in Case b1) and emax=1/3. If pX=(2/3,1/6,1/6) and pY=(1/6,2/3,1/6), we are in
Case b2) and emax=4/15. In both of these examples, I={1,2}.
Below Span(PX) (resp. Span(PY)) is the linear span of PX (resp. PY).
Lemma 1.4.
In Case b2), there exists a unique pair of reals s,t such that
sPX+tPY=(1)i∈I
Proof.
The only alternatives to Case b1) are: PX and PY are linearly independent, or PX and PY are linearly dependent and PX≠PY. If the latter, given that PX and PY have positive coordinates, PX<PY (coordinate by coordinate) or PY<PX. But PX<PY clearly implies that Case a) occurs, and not Case b), leading to a contradiction (and similarly for PY<PX).
Therefore, the only alternative to Case b1) is for PX and PY to be linearly independent. We now prove that H:=(1)i∈I∈Span(PX,PY). To do so, we use an
elementary duality result: if E is a finite-dimensional space with dual
E∗, and if l1,l2,l3∈E∗, then Ker(l1)∩Ker(l2)⊂Ker(l3) if and only if l3∈Span(l1,l2). Indeed, considering the restrictions l2∣Ker(l1) and
l3∣Ker(l1) of l2 and l3 to the subspace Ker(l1), we have Ker(l2∣Ker(l1))⊂Ker(l3∣Ker(l1)). Therefore, l3∣Ker(l1)=λl2∣Ker(l1) for
some λ∈R, and if u∉Ker(l1), then l3=λl2+((l3(u)−λl2(u))/l1(u))l1
(because this is true on Ker(l1) and on u).
So, returning to our problem, H∈Span(PX,PY) is equivalent to: Ker(PX∗)∩Ker(PY∗)⊂Ker(H∗), where for any L∈RI, L∗ denotes the linear form defined by L∗(y)=L⋅y. Let x∈Ker((PX)∗)∩Ker((PY)∗). Clearly, there exists ε>0 such that uI+εx and uI−εx have non-negative coordinates,
and so they are in LU, and
H∗(uI+εx)=H∗(uI−εx)=emax otherwise one of them
would be greater than emax, hence x∈Ker(H∗).
∎
For instance, taking again
pX=(2/3,1/6,1/6) and pY=(1/6,2/3,1/6),
we get PX=(3/2,6),PY=(6,3/2) and s=t=2/15.
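The values s=t=2/15 can be recovered by solving the 2×2 linear system sPX+tPY=(1,1) directly. The sketch below is ours and only covers the situation ∣I∣=2 of this example; the helper name `solve_s_t` is hypothetical and Cramer's rule is used with exact fractions.

```python
from fractions import Fraction


def solve_s_t(P_x, P_y):
    """Solve s * P_x + t * P_y = (1, 1) for two-dimensional P_x, P_y."""
    a, c = P_x  # coefficients of s in the first and second equation
    b, d = P_y  # coefficients of t in the first and second equation
    det = a * d - b * c
    if det == 0:
        raise ValueError("P_x and P_y are linearly dependent")
    s = (d - b) / det
    t = (a - c) / det
    return s, t


p_x = [Fraction(2, 3), Fraction(1, 6)]  # restriction of pX to I = {1, 2}
p_y = [Fraction(1, 6), Fraction(2, 3)]  # restriction of pY to I = {1, 2}
P_x = [1 / p for p in p_x]
P_y = [1 / p for p in p_y]
s, t = solve_s_t(P_x, P_y)
print(s, t)
```

In this example s+t equals 4/15, the value of emax given above for the same pair of distributions.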
Without loss of generality (switching the roles of X and Y), one can thus assume that either
Case a) or Case b) occurs.
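For concrete distributions, the case at hand can be read off from the optimal vertices of the polytope U. The sketch below is ours and not the authors' procedure. It assumes that U is given by u≥0, ∑ui/piX≤1 and ∑ui/piY≤1, that LU is the convex hull of the optimal vertices (each having at most two non-zero coordinates), and it reads Case b1) as "piX=piY for all i∈I". The names `optimal_vertices` and `classify` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def optimal_vertices(p_x, p_y):
    """Return (e_max, list of optimal vertices of U) using exact arithmetic."""
    m = len(p_x)
    verts = []
    for i in range(m):  # vertices with one non-zero coordinate
        u = [Fraction(0)] * m
        u[i] = min(p_x[i], p_y[i])
        verts.append(u)
    for i, j in combinations(range(m), 2):  # both constraints tight
        det = 1 / (p_x[i] * p_y[j]) - 1 / (p_x[j] * p_y[i])
        if det == 0:
            continue
        u_i = (1 / p_y[j] - 1 / p_x[j]) / det
        u_j = (1 / p_x[i] - 1 / p_y[i]) / det
        if u_i >= 0 and u_j >= 0:
            u = [Fraction(0)] * m
            u[i], u[j] = u_i, u_j
            verts.append(u)
    best = max(sum(u) for u in verts)
    return best, [u for u in verts if sum(u) == best]


def classify(p_x, p_y):
    """Return 'a', 'a (symmetric)', 'b1' or 'b2'."""
    p_x = [Fraction(p) for p in p_x]
    p_y = [Fraction(p) for p in p_y]
    _, opt = optimal_vertices(p_x, p_y)

    def load(u, p):
        return sum(u_k / p_k for u_k, p_k in zip(u, p))

    if any(load(u, p_y) < 1 for u in opt):
        return "a"  # the word Y is not fully used
    if any(load(u, p_x) < 1 for u in opt):
        return "a (symmetric)"  # the word X is not fully used
    support = {k for u in opt for k in range(len(u)) if u[k] > 0}  # the set I
    return "b1" if all(p_x[k] == p_y[k] for k in support) else "b2"


F = Fraction
print(classify([F(3, 8), F(3, 8), F(1, 4)], [F(1, 2), F(3, 8), F(1, 8)]))
```

Only vertices need to be inspected because the two constraint functions are linear and bounded by 1 on U: a point of LU with a slack constraint exists if and only if some optimal vertex has that constraint slack.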
In Case b), the following technical lemma, whose proof (given in the Appendix) is not crucial to
understand the rest of this manuscript, is needed to state our main theorem.
Let us define first, in Case b1),
[TABLE]
and, similarly,
[TABLE]
It is clear, from the definition of I, that if i∈I is such that piX≥emax,
then piY<emax, therefore sX and sY are well defined
and one can check that sX,tX,sY,tY∈[0,1].
In order to state our next lemma, below let
E={x∈Rm:x1+⋯+xm=0} and let
E′={x∈E:∀i∈Ic,xi≥0}.
Lemma 1.5.
Let ν∈(Rm)2 be such
that for all i∈Ic,νiX=νiY=0, then the following maximum is well defined:
[TABLE]
and
[TABLE]
for some constant C>0, depending only on
pX and pY, as given in Lemma 2.3.
In Case b1), writing S∙:=∑i∈Iνi∙, then
[TABLE]
In Case b2), and recalling the notations of Lemma 1.4, then
[TABLE]
1.4 Representation of emax
We now aim to give a more explicit expression for emax defined by (1.13).
To do so, let us start with the following lemma which asserts that, in the non-probabilistic setup,
"two letters are enough to reach the maximum".
Lemma 1.6.
There exist i,j∈{1,…,m} and λ∈KΛ2
such that for all k∉{i,j},λkX=λkY=0.
Proof.
Let u∈LU having (at least) three non-zero coordinates. Then,
recalling the correspondence between LU and KΛ2, in order
to prove the result it is enough to show that there exists a
v∈LU having one more null coordinate.
Without loss of generality, let u1,u2,u3>0, and let
[TABLE]
Since the dimension of V is at least one, let x∈V∖{0}.
Then clearly, there exists t∈R such that v:=u+tx has
non-negative coordinates and one more null coordinate than u.
Moreover, v∈LU, which completes the proof.
∎
If there exists u∈LU such that all its coordinates except one, call it i, are zero,
then emax=piX∧piY. Otherwise, let i,j be defined as in the
statement of the lemma. At first, assume that piX=pjX and that piY≤pjY,
then emax≤(λiXpiX∧λiYpjY)+(λjXpiX∧λjYpjY)≤(λiXpiX+λjXpiX)∧(λiYpjY+λjYpjY)=piX∧pjY=pjX∧pjY, so emax=pjX∧pjY and we are actually in the first case, giving
a contradiction. Similarly, if piX≤pjX and piY≤pjY, using λiXpiX∧λiYpiY≤λiXpjX∧λiYpjY we get a contradiction as well.
Therefore, in the second case, necessarily, possibly permuting i and j, piX<pjX and piY>pjY.
Additionally, it is necessary to have that piX<piY, otherwise emax=piY and we
are in the first case. Similarly, pjY<pjX. Then, in this case, the maximum is when
the quantities in each minima are equal, and so one shows that
[TABLE]
Therefore,
[TABLE]
Note that
[TABLE]
where the left inequality is clear, while the right one is easily seen
from the expression of f. Note also that above, emax
is equal to the lower bound when the second max in (1.23) is over the
empty set, and is equal to the upper bound when there exists i such that pmaxX=piX≤piY or pmaxY=piY≤piX.
When pX=pY (same distribution for the two words), we see that emax=maxi∈{1,…,m}piX is minimal when pX is uniform (for a given alphabet). This is to be contrasted
with the case of the length of the longest common subsequences, LCn (defined just as LCIn,
but without the increasing condition). Indeed, little is known about γ∗:=limn→+∞ELCn/n, for instance whether or not it is
minimal (for a given alphabet) for the uniform distribution.
Since LCn is defined with one less constraint than LCIn, clearly emax≤γ∗
which is of potential interest since the exact value of γ∗ is unknown, even in the
binary uniform case. (This last inequality provides a lower bound on γ∗, no matter the distributions on the letters. For uniform letters, emax=1/m, although it is known that, then, asymptotically, γ∗∼2/√m, see [7].)
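The inequality emax≤γ∗ stems from the pathwise comparison LCIn≤LCn, which can be observed on simulated words. The following sketch is ours: the function names are hypothetical, the letters are taken uniform on {1,…,m} purely as an illustration, and a single sample of moderate length says little about the limits themselves.

```python
import random


def lcs_length(x, y):
    """Classical longest common subsequence length (no monotonicity)."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        cur = [0]
        for j, yj in enumerate(y, start=1):
            cur.append(prev[j - 1] + 1 if xi == yj else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lci_length(x, y, m):
    """Longest common weakly increasing subsequence length, letters in 1..m."""
    # prev[j][c]: best length for the current prefix of x, y[:j], letters <= c.
    prev = [[0] * (m + 1) for _ in range(len(y) + 1)]
    for xi in x:
        cur = [[0] * (m + 1)]
        for j, yj in enumerate(y, start=1):
            row = [0] * (m + 1)
            for c in range(1, m + 1):
                row[c] = max(prev[j][c], cur[j - 1][c])
                if xi == yj and xi <= c:
                    row[c] = max(row[c], prev[j - 1][xi] + 1)
            cur.append(row)
        prev = cur
    return prev[-1][m]


rng = random.Random(0)  # fixed seed, for reproducibility only
m, n = 3, 200
x = [rng.randint(1, m) for _ in range(n)]
y = [rng.randint(1, m) for _ in range(n)]
lci, lcs = lci_length(x, y, m), lcs_length(x, y)
print(lci / n, lcs / n)  # to be compared with 1/m and with gamma*, respectively
```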
1.5 A criterion to distinguish the three cases
For a given distribution, it is not completely apparent which
situation is in play as far as
the respective cases a), b1) and b2) are concerned.
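Our next result makes this more transparent. First, set e1:=maxi∈{1,…,m}(piX∧piY) and let e2 be the second maximum in (1.23), i.e., the maximum of e(i,j) over the relevant pairs i,j, so that emax=max(e1,e2). Then the following holds.
Let e1<e2, then Case b2) holds true. Let e1≥e2, then: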
(i) If for some i∈{1,…,m} such that piX∧piY=e1, one has piX≠piY, then Case a) holds true or so does its symmetric version: there exists u∈LU such that u1/p1Y+⋯+um/pmY=1 and u1/p1X+⋯+um/pmX<1.
(ii) Otherwise, i.e., if for all i∈{1,…,m} such that piX∧piY=e1, one has
piX=piY, then if e1>e2 Case b1) holds true, while if e1=e2, then so
does Case b2).
Proof.
First, for any 0<δ<1, let emax,δ, e1,δ, e2,δ and eδ(i,j) be defined just as emax,e1,e2 and e(i,j) but replacing piY with δpiY, for all i∈{1,…,m}.
Next, from the very definition of Case a): There exists u∈LU such that u1/p1X+⋯+um/pmX=1 and u1/p1Y+⋯+um/pmY<1. Letting δ0:=u1/p1Y+⋯+um/pmY, we have u1/(δ0p1Y)+⋯+um/(δ0pmY)=1 so emax,δ0≥emax and therefore (clearly, emax,δ is non-decreasing in δ) emax,δ0=emax. So when Case a) occurs there exists 0<δ0<1, such that
for all δ∈(δ0,1],emax,δ=emax, and one can easily check the converse.
A similar result continues to hold for the symmetric version of Case a).
We can now prove the statement of the theorem by distinguishing the following four occurrences.
(1) Let e1<e2. Let 0<δ0<1 be close enough to 1 such that for any
δ∈(δ0,1], the set of
pairs i,j∈{1,…,m} such that piX<pjX, piY>pjY, piX<piY and pjY<pjX is equal to the set of i,j∈{1,…,m} such that
piX<pjX, δpiY>δpjY, piX<δpiY and δpjY<pjX. Since for every i,j in this set, it is immediate to check that e(i,j)>eδ(i,j), the maxima satisfy e2>eδ,2. Since e1<e2, by continuity, for δ close enough to 1, max(eδ,1,eδ,2)=eδ,2 so eδ,max<emax, hence we are in Case b). There are i,j∈{1,…,m} such that emax=e2=e(i,j), so
i,j are in I, but piX<pjX so we are
in Case b2).
(2) Let e1≥e2, and let there exist i∈{1,…,m} such that piX∧piY=e1 and piX≠piY, say, piX<piY. Then, the very definition of Case a) is verified with the vector u∈Rm
having coordinates equal to zero except for ui=piX. If instead, piX>piY then the symmetric
case holds true.
(3) Let e1>e2 and let for all i∈{1,…,m} such that piX∧piY=e1, piX=piY. By continuity, for δ close enough to 1, max(eδ,1,eδ,2)=eδ,1=δemax so we are in Case b). Additionally, one verifies that under our assumptions I is restricted to the set
of i∈{1,…,m} such that piX=piY=emax. Therefore, we are, in fact, in Case b1).
(4) Let e1=e2 and let for all i∈{1,…,m} such that piX∧piY=e1, piX=piY. From what is done above, we see that for δ close enough to 1, eδ,max<emax hence we are in Case b). Once again, since there are i,j∈{1,…,m} such that emax=e2=e(i,j), we are in Case b2).
∎
To present another explicit example, let us fully describe the case m=2,
with p1X,p2X,p1Y, and p2Y. The following completely describes the various cases:
•
If p1X=p1Y, then (since, necessarily, p2X=p2Y) emax=max(p1X,p2X)=max(p1X,1−p1X) and we are in Case b1).
•
If p1X≠p1Y and 1/2∉(min(p1X,p1Y),max(p1X,p1Y)), then
[TABLE]
and we are in Case a) or its symmetric.
•
If p1X≠p1Y and 1/2∈(min(p1X,p1Y),max(p1X,p1Y)), then
[TABLE]
and we are in Case b2).
2 The limiting law
It is clear, from the previous section, that the proper way to center (and normalize) LCIn is via
and therefore the convergence in distribution of Znc will imply the convergence, in distribution, of Zn towards the same limit.
2.1 Statement of the theorem
Below is the main result of the paper. In this statement,
the covariance matrices of the Brownian motions stem from the covariance matrix of the
rescaled variables (𝟙{Xk=i})i∈I
(resp. (𝟙{Yk=i})i∈I) used to construct the polygonal
approximations Bin,∙ (here, and throughout,
∙ is short for either X or Y). Indeed, note that, for i≠j, E[((𝟙{Xk=i}−piX)/√(piX(1−piX)))((𝟙{Xk=j}−pjX)/√(pjX(1−pjX)))]=−√(piXpjX/((1−piX)(1−pjX))) (with a similar result for Y).
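The covariance of the rescaled indicators can be double-checked numerically. In the sketch below (ours, with hypothetical function names), the expectation is computed exactly by summing over the values of a single letter and compared with the closed form: ones on the diagonal and −√(pipj/((1−pi)(1−pj))) off the diagonal. For simplicity the matrix is built over the whole alphabet rather than over I only, and the distribution p is an arbitrary illustrative choice.

```python
import math


def exact_normalized_cov(p, i, j):
    """E[(1{X=i}-p_i)(1{X=j}-p_j)] / sqrt(p_i(1-p_i)p_j(1-p_j)) for X ~ p."""
    cov = sum(p[k] * ((k == i) - p[i]) * ((k == j) - p[j]) for k in range(len(p)))
    return cov / math.sqrt(p[i] * (1 - p[i]) * p[j] * (1 - p[j]))


def covariance_matrix(p):
    """Closed form: 1 on the diagonal, -sqrt(p_i p_j/((1-p_i)(1-p_j))) elsewhere."""
    m = len(p)
    return [
        [1.0 if i == j else -math.sqrt(p[i] * p[j] / ((1 - p[i]) * (1 - p[j])))
         for j in range(m)]
        for i in range(m)
    ]


p = [0.5, 0.3, 0.2]  # illustrative distribution, not taken from the text
C = covariance_matrix(p)
max_gap = max(
    abs(C[i][j] - exact_normalized_cov(p, i, j))
    for i in range(len(p)) for j in range(len(p))
)
print(max_gap)
```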
Theorem 2.1.
Let BX and BY be two independent ∣I∣-dimensional Brownian motions defined on
[0,1] with respective covariance matrix CX defined by Ci,iX=1 and Ci,jX=−√(piXpjX/((1−piX)(1−pjX))), for i≠j in I, and CY defined in a similar fashion, replacing piX by piY and pjX by pjY. For all λ∈KΛ2 and i∈I, set
[TABLE]
If there exists u∈LU such that u1/p1X+⋯+um/pmX=1 and
u1/p1Y+⋯+um/pmY<1 (Case a)), then
If for all u∈LU, u1/p1X+⋯+um/pmX=u1/p1Y+⋯+um/pmY=1 (Case b)), then
[TABLE]
where m is given by (1.19).
At this point, one can remark that emax is invariant with respect to the order in which
the letters are chosen, and that
both in Case a) and Case b1), the above limiting laws are invariant as well (to see this fact in Case a), recall Lemma 1.3). Therefore, in Case a) and Case b1), no matter the prescribed order (increasing, decreasing, etc.) the asymptotic behavior of the length of the corresponding optimal alignments is the same.
We refer the reader to Section 3.2 for more general results of this flavor.
In Case b2) it is less clear that the limiting distribution is permutation-invariant as it might not just boil down
to m(ν). Indeed, in Case b2) the limiting law can be written as
the law of
[TABLE]
where V(λ) is in (Rm)2,
and defined via
[TABLE]
where the Bi∙ are Brownian motions which are, up to a multiplicative factor, as in our main theorem. Further introducing, for any permutation
σ of {1,…,m}, Vσ(λ) defined via
[TABLE]
we have V(λ)=VId(λ), where Id is the identity permutation.
When the letters are not required to be
non-decreasing, but instead follow an
order given by σ, the limiting law is simply the law of
Zσ:=maxλ∈KΛ2 ∑i∈{1,…,m}, ∙∈{X,Y} Vσ(λ)i∙.
It is still not that clear whether or not
this last quantity depends on σ. For example, if m=3 and KΛ2=Λ2 and B1X is a standard Brownian motion, while all others are null, define σ by σ(1)=2,σ(2)=1,σ(3)=3, then with probability one Zσ>ZId. However, in Case b2), it is actually
not possible to have KΛ2=Λ2 (and also to have only one non null Brownian motion) but this shows that a general argument for the validity of the permutation-invariance is not that transparent.
The proof of this theorem is based on a non-probabilistic lemma. First, let Enη be the set of all continuous functions b from [0,1] into R such that: for all x,y in [0,1], ∣b(y)−b(x)∣≤(nη∣y−x∣+nη−1/2)/2. Then, for all b∈(Enη)m,
i∈{1,…,m} and λ∈Λ, set vib(λ)=bi(λ1+⋯+λi)−bi(λ1+⋯+λi−1), and for all bX,bY∈(Enη)m and λ∈Λ2 let
[TABLE]
One can think of biX (resp. biY) as √(piX(1−piX))Bin,X(ω)
(resp. √(piY(1−piY))Bin,Y(ω)) for a fixed ω∈Anη, where the symbol bX (resp. bY) is used for ease of notation
and in order to emphasize the non-probabilistic nature of the proof. For further ease of notation,
we omit the dependency in bX and bY in the notation zn.
This omission is also present in v and vX is just short for vbX (similarly
with Y), and further write v(λ):=(vX(λX),vY(λY)).
In Case a), for all λX∈Λ, let
[TABLE]
In Case b), for all λ∈Λ2, let
[TABLE]
Next, let us finally present two simple inequalities stemming from the very
definition of Enη, often used in the sequel,
which are valid for all b∈Enη, λ,λ′∈Λ, i∈{1,…,m},
∙∈{X,Y}, namely,
[TABLE]
[TABLE]
Lemma 2.2.
There exists a sequence (εn)n≥1 of positive reals converging to zero and
such that for all n≥1 and bX,bY∈(Enη)m,
either ∣maxλ∈Λ2zn(λ)−maxλ∈Jza(λ)∣≤εn, or ∣maxλ∈Λ2zn(λ)−maxλ∈KΛ2zb(λ)∣≤εn, in Case a) or b), respectively.
The proof of this crucial lemma is delayed
to the next subsections, and instead we turn our attention to the proof of the main theorem.
For all ω∈Anη, Bn,X(ω) and Bn,Y(ω) are in Enη so by Lemma 2.2, ∣Znc(ω)−Znb(ω)∣≤εn. So ∣Znc−Znb∣𝟙Anη≤εn, but Znc−Znb=(Znc−Znb)𝟙Anη+(Znc−Znb)𝟙(Anη)c, where this second term tends to zero in probability, and therefore so does Znc−Znb. Next, by Donsker’s theorem and the continuity of m (recalling Lemma 1.5), Znb tends to Zb in distribution, hence so does Znc and finally so does Zn, recalling (2.1). The proof in Case a) is analogous and therefore omitted.
∎
Let us now turn to the proof of Lemma 2.2. The method of proof goes as follows: Maximizing zn(λ) is equivalent to maximizing
[TABLE]
which converges, as n goes to infinity, to f(λ)−emax. So one can expect that λ must "almost" be maximizing f, i.e., be in or "close to" the set KΛ2. In Case a), we bound the
maximum by taking the maximum over two sets which are closer and closer to the set J.
In Case b), first write λ=λKΛ2+λr
(actually dealing with a λ−a in order to have a vector space, but the idea is the same), then ignore the small perturbation term λr in v, and the idea is (roughly) to fix λKΛ2 and to find the maximum over λr. In both cases, the end of the proof consists in showing how the maximum of the relevant function (za or zb) over a set of parameters that "tends to" a limiting set goes to the maximum over this limiting set.
First, fix b=(bX,bY)∈((Enη)m)2. Next,
for ease of notation, omit in the sub-index b in z and v.
Roughly speaking, we begin by proving that any λ maximizing zn must have "small" coordinates outside of I, and therefore we can "replace" the variations vi., for i∉I,
by zero.
Let
[TABLE]
Let us assume first that I≠{1,…,m}. Then by Lemma 1.3, psecX<pmaxX. Our first observation is that if λ maximizes zn, i.e., if
zn(λ)=maxλ∈Λ2zn(λ), then
[TABLE]
In words, the above indicates that the contribution of the letters not in I is, as expected,
very limited.
To prove this inequality, note that on the one hand (recalling Lemma 1.3 and (2.6)),
[TABLE]
while on the other hand, for λ~∈KΛ2, using (2.6)
and the elementary inequality (1.10),
[TABLE]
The inequality (2.9) follows, and it therefore allows,
for i∉I, to replace
the terms viX(λX)
by zero. More precisely, let for all λ∈Λ2,
[TABLE]
then as shown next,
[TABLE]
and this inequality remains true when I={1,…,m} (since then maxλ∈Λ2zn(λ)=maxλ∈Λ2znI(λ) and ∣Ic∣=0).
Indeed, let λ∈Λ2 be such that zn(λ)=maxλ∈Λ2zn(λ). Using (1.10) along with (2.6) (λiX≤2mnη−1/2/(pmaxX−psecX), for all i∉I), it follows that
[TABLE]
Moreover, let λ~∈Λ2 be such that maxλ∈Λ2znI(λ)=znI(λ~). Then, just as in proving (2.9), it follows that
∑i∉Iλ~iX≤2∣I∣nη−1/2/(pmaxX−psecX). Hence
[TABLE]
which completes the proof.
2.3.2 Bounds on the maximum with different sets of constraints
Let us next define two sets "close" to J. To do so, let Sn=2∣I∣2nη−1/2, let CI=∑i∈I1/piY, let Tn=CI2nη−1/2, and finally let
[TABLE]
and
[TABLE]
Note that by Lemma 1.3, setting δi1=(𝟙{i=i1})i∈{1,…,m},
δi1∈Jn− eventually.
We show, in this part of the proof, that
[TABLE]
Let us prove the upper bound first. Let λ∈Λ2 be such
that znI(λ)=maxλ∈Λ2znI(λ), and
let S be the unique real such that
[TABLE]
Then, there exists i0∈I such that,
[TABLE]
since otherwise, ∑i∈IλiY>1, which is a contradiction.
Then, using the following inequalities,
[TABLE]
leads to
[TABLE]
Just as in obtaining the inequality (2.10), we have −∣I∣nη≤znI(λ), hence
S≤2∣I∣2nη−1/2, i.e., λX∈Jn+, leading to conclude with the
upper estimate:
[TABLE]
Let us now turn our attention to the lower bound. Let λX∈Jn− be such that za(λX)=maxλ∈Jn−za(λ).
Since
[TABLE]
there
exists λY∈Λ such that for i∈I, λiY≥(pmaxXλiX+2nη−1/2)/piY and for i∉I, λiY=0. For all i∈I,
[TABLE]
Therefore,
[TABLE]
and maxλ∈Jn−za(λ)≤maxλ∈Λ2znI(λ).
2.3.3 End of the proof
Both quantities ∣maxλ∈Jn−za(λ)−maxλ∈Jza(λ)∣
and ∣maxλ∈Jn+za(λ)−maxλ∈Jza(λ)∣ still need to be investigated. Let C1=(1−pmaxX/pi1Y)>0. For λX∈Λ and t∈(0,1), let λX,t=tδi1+(1−t)λX. It is straightforward to prove that for all n greater than some constant, depending only on η, pX and pY, and for all λX∈J, λX,Tn/C1 is well defined, and is in Jn−, while for
all λX∈Jn+, λX,2Sn/C1∈J.
Fix b=(bX,bY)∈((Enη)m)2. Just as in Case a), we omit in the
notation
the sub-index b. Let E={x∈Rm:x1+⋯+xm=0}, let K be the
subspace of E2 defined by
[TABLE]
and let P (recalling the definition of a following (1.15): a∈KΛ2, for all i∈I,piXaiX=piYaiY>0, for i∉I,ai∙=0, and f(a)=emax) be given by:
[TABLE]
Note that Λ2=a+P. By definition of the case b), for all λ∈KΛ2,
for all i∈I, λiXpiX=λiYpiY, while for all i∉I, λiX=λiY=0. Reciprocally, let λ∈Λ2 be such that for all i∈I, λiXpiX=λiYpiY and for all i∉I, λiX=λiY=0; we show that λ∈KΛ2. Let u∈RI be defined by ui=piXλiX−piXaiX
for all i∈I. We have that u⋅PX=u⋅PY=1−1=0 so by
Lemma 1.4, u⋅(1)i∈I=0, hence the result.
This characterization of KΛ2, combined with Λ2=a+P,
gives us
[TABLE]
Since piXaiX=piYaiY, for all i∈{1,…,m},
[TABLE]
Clearly,
[TABLE]
Note also that for all x∈(Rm)2, f(a+x)=f(a)+f(x) so by (2.14)
[TABLE]
Our next result is an elementary projection result.
Lemma 2.3.
There exists C>0 depending only on pX and pY such that for all x∈P,
there exist xK∩P∈K∩P and xr∈E2
such that x=xK∩P+xr and ∥xr∥∞≤−Cf(x).
Proof.
Let K⊥ be the orthogonal complement of K in E2 (for the usual Euclidean inner product defined on E2 by, for x,y∈E2, x⋅y:=x1Xy1X+⋯+xmXymX+x1Yy1Y+⋯+xmYymY). Let x∈P (so x∈E2) and let (xK,xK⊥) be its orthogonal decomposition, i.e., xK∈K, xK⊥∈K⊥ and x=xK+xK⊥. Without loss of generality, assume xK⊥≠0. For ease of notation, set g=−f. Let
[TABLE]
In order to bound the image of xK⊥, we first rescale it to make it an
element of P: it is easy to check that y:=(amin/∥xK⊥∥∞)xK⊥∈P. Now, consider the sphere,
[TABLE]
Then, Samin∩P is a non-empty compact set, so let
[TABLE]
Recalling (2.15), M>0. Since y∈Samin∩P, M≤g(y) so that, using
g(xK⊥)=g(x),
[TABLE]
This is almost the desired result, except that xK might not be in P. Let us assume, firstly, that
g(x)≤M (and therefore that ∥xK⊥∥∞≤amin). Let xK∩P=(1−∥xK⊥∥∞/amin)xK and let xr=(∥xK⊥∥∞/amin)xK+xK⊥. We next prove that xK∩P∈K∩P. Since x∈P, for i∈I,
[TABLE]
and for i∉I, xiK∩P=0, since xK∩P∈K. So xK∩P∈K∩P.
Let us turn to xr. Since a+x∈Λ2, ∥x∥∞≤1. Moreover, xK is the orthogonal projection
of x so ∥xK∥∞≤2m∥x∥∞≤2m and
[TABLE]
Setting C:=(2m+amin)/M, we have just proved that if g(x)≤M, then
there exist suitable xK∩P and xr satisfying the lemma.
Finally, if g(x)>M, we let xK∩P=0 and
xr=x, so that ∥xr∥∞≤1<g(x)/M<Cg(x) which completes the proof.
∎
2.4.2 Separation of the parameters
To begin with, we prove that maxx∈Pzn(a+x) can be written as a maximum over two
kinds of parameters, one belonging to K in the variations vi., the other one being a
small remaining term.
Let x∈P be such that zn(a+x)=maxλ∈Λ2zn(λ). Then,
and applying Lemma 2.3 to (2.16) gives
maxx∈Dnzn(x)=maxx∈Pzn(a+x).
Let us next define a slight modification of zn by letting, for
all (xK∩P,xr)∈Dn,
[TABLE]
The parameters are now "separated".
For all (xK∩P,xr)∈Dn, by (2.7),
[TABLE]
so that
[TABLE]
2.4.3 Independence of the parameters
A major issue with Dn is the condition xK∩P+xr∈P. We would rather have a
set of possible values for xr independent of the value of xK∩P. To try to achieve that
goal, let
[TABLE]
and let Dn′⊂Dn be given by
[TABLE]
Now, recalling the definition E′={x∈E:∀i∈Ic,xi≥0}⊂E,
we have that
[TABLE]
For $(x^{K\cap P},x^r)\in D_n$, and for $n$ large enough so that $(2Cm/a_{\min})\,n^{\eta-1/2}\le 1$, it follows that, letting $x'^{K\cap P}:=\big(1-(2Cm/a_{\min})\,n^{\eta-1/2}\big)\,x^{K\cap P}$, $(x'^{K\cap P},x^r)\in D_n'$, so by (2.7)
Let us now prove that for n large enough, maxx∈Dn′znη(x)=maxλ∈a+K∩Pnm(vX(λX),vY(λY)).
Fix xK∩Pn∈K∩Pn. Applying the previous lemma to
ν:=v(a+xK∩Pn), since ∥ν∥∞≤nη, by Lemma 1.5
and so, using
(2.17), (2.18) and (2.19) (recall that a+K∩P=KΛ2),
[TABLE]
3 Consistency with previous results and generalizations
3.1 Two words with identical distributions
As stated in the introductory section, Theorem 1.1 and the conjectured Theorem 1.2 are
consequences of our main theorem. Indeed, let Xk and Yk (k=1,2,…) have the same distribution, then note that
[TABLE]
and so the multiplicity k∗ of pmax is equal to ∣I∣ and we are
in Case b1).
It is also clear that
[TABLE]
In this case, Lemma 1.5 simplifies and gives m(ν)=SX∧SY, so our theorem states that the limiting distribution of Zn is
[TABLE]
where BX and BY are two independent k∗-dimensional Brownian motions on [0,1] with respective
covariance matrices defined in Theorem 2.1.
The proof of Corollary 3.3 in [5] shows that, by writing BX and BY as
linear combinations of independent standard Brownian motions, (3.1) is identical
in law to
[TABLE]
where now BX and BY are two independent
k∗-dimensional standard Brownian motions on [0,1].
Dividing both sides by pmax, one
obtains the conjectured Theorem 1.2 which reduces to
Theorem 1.1 when
k∗=m.
3.2 Generalization to any fixed sequence of blocks
As pointed out by an Associate Editor, and also developed,
for binary alphabets, in [8], a longest common increasing
subsequence can be viewed as a longest common subsequence where letters
are aligned in blocks. (For LCIn, a non-void block only aligns a single type of letter
and the first block consists of the letter α(1):=1, then the second one consists of α(2):=2 and so on, up to the last block eventually consisting of the letter α(m):=m.) So, more generally, one could investigate the longest common subsequences where letters are aligned in blocks of letters
α(1),…,α(l), for any l≥m, and where α:{1,…,l}→Am is onto. For any fixed α, the length of the longest common subsequences where letters are aligned with blocks α is at most equal to LCn, the length of the longest common subsequences,
and moreover, LCn is the maximum of these lengths over all the possible block-orders α
(l is not fixed). To pass
from the block version to LCn, there is, however, a major issue of interchange of limits. In what
follows, we first merely give, for any fixed α, the limiting law of the (rescaled) length of the longest common subsequences where letters are aligned in blocks α(1),…,α(l), and then
the corresponding limiting laws when allowing
for a fixed number of such blocks.
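To make the notion of block alignment concrete, here is a minimal dynamic-programming sketch (not taken from the paper) computing, for two finite words, the length of the longest common subsequence of the form α(1)…α(1)α(2)…α(2)…α(l)…α(l), where each block may be empty. The function name `lc_blocks` and the encoding of words and of α as Python lists are illustrative assumptions; taking α=(1,…,m) gives the length of the longest common and weakly increasing subsequences.

```python
def lc_blocks(x, y, alpha):
    """Length of the longest common subsequence of x and y whose letters
    are aligned in the (possibly empty) blocks alpha[0], alpha[1], ...

    x, y  : sequences of letters (e.g. lists of ints)
    alpha : sequence of letters giving the order of the blocks
    """
    n, k = len(x), len(y)
    # prev[i][j]: best length for x[:i], y[:j] using the blocks handled so far
    prev = [[0] * (k + 1) for _ in range(n + 1)]
    for letter in alpha:
        cur = [[0] * (k + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            for j in range(k + 1):
                best = prev[i][j]  # current block left empty
                if i > 0:
                    best = max(best, cur[i - 1][j])  # skip x[i-1]
                if j > 0:
                    best = max(best, cur[i][j - 1])  # skip y[j-1]
                if i > 0 and j > 0 and x[i - 1] == y[j - 1] == letter:
                    # align one more copy of `letter` in the current block
                    best = max(best, cur[i - 1][j - 1] + 1)
                cur[i][j] = best
        prev = cur
    return prev[n][k]


x = [2, 1, 2, 1]
y = [2, 1, 1, 2]
lci = lc_blocks(x, y, [1, 2])        # blocks 1 then 2: common increasing case
lc_212 = lc_blocks(x, y, [2, 1, 2])  # blocks 2, 1, 2
```

On this toy pair, the block order (2,1,2) allows the common subsequence 2,1,2, which the increasing order (1,2) cannot produce; the running time is of order l·n² for words of length n.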
Firstly, defining for any k∈N, k≥2, Λk:={λ∈(R+)k:λ1+⋯+λk=1}, we claim that:
[TABLE]
Indeed, to see the validity of this equality, note that the left-hand side above is greater than or equal to the right-hand side since α is onto, while it is also less than or equal to it since we can partition {1,…,l} via
α−1({1}),α−1({2}),…,α−1({m}) and use the basic inequality
(a∧b)+(c∧d)≤(a+c)∧(b+d).
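For completeness, the basic inequality follows by bounding each minimum by one of its arguments; a short derivation, for arbitrary real numbers a, b, c, d, reads:

```latex
\begin{align*}
(a\wedge b)+(c\wedge d) &\le a+c, \\
(a\wedge b)+(c\wedge d) &\le b+d, \\
\text{hence}\quad (a\wedge b)+(c\wedge d) &\le (a+c)\wedge(b+d).
\end{align*}
```

Iterating this bound over all the indices in a given set α−1({i}) merges the corresponding blocks into a single one.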
Next, to adapt the proof of our main theorem, we need to define the set Uα, as well as all other quantities which depended on m or p, with l instead of m and
pα(1)∙,…,pα(l)∙ instead of p1∙,…,pm∙.
Note also that, when l>m, the quantities pα(1)∙,…,pα(l)∙ do not form a
probability mass function (their sum is not equal to one), but all their elements are positive
which is enough to have everything well defined.
Formally, for example,
[TABLE]
ϕα:Rl→R is given by
[TABLE]
and Iα is now defined to be the set of integers i∈{1,…,l} such that there exists
ui∈LUα with ui>0. Using almost the same proof as the one showing the
equality of the two maxima in (3.2), we
get α−1(I)=Iα, where I is defined as
before. There is no need to redefine the
various cases a), b1), b2) here since they coincide with those
previously defined when taking pα(1)∙,…,pα(l)∙ instead of
p1∙,…,pm∙. For example, "there exists u∈Uα maximizing
ϕα over Uα such
that pα(1)Xu1+⋯+pα(l)Xul=1 and pα(1)Yu1+⋯+pα(l)Yul<1" is equivalent to Case a) defined in
Section 1.3.
Finally, the function m defined in Lemma 1.5 can be extended naturally to (Rl)2.
Within this generalized setting, the proof of Lemma 2.2 carries over, giving us the
following theorem for, LCnα, the length of the longest common
subsequences with blocks α(1),…,α(l).
Theorem 3.1.
Let BX and BY be two independent ∣I∣-dimensional Brownian motions defined on
[0,1] with respective covariance matrix $C^X$ defined by $C^X_{i,i}=1$ and $C^X_{i,j}=-\sqrt{p^X_{\alpha(i)}p^X_{\alpha(j)}\big/\big((1-p^X_{\alpha(i)})(1-p^X_{\alpha(j)})\big)}$, for $i\neq j$ in $I$, and $C^Y$ defined in a similar fashion. For all λ∈KΛ2α and i∈Iα, set
[TABLE]
If there exists u∈LUα such that pα(1)Xu1+⋯+pα(l)Xul=1
and pα(1)Yu1+⋯+pα(l)Yul<1, or equivalently if there
exists u∈LU such that p1Xu1+⋯+pmXum=1
and p1Yu1+⋯+pmYum<1 (Case a)), then
[TABLE]
If for all u∈LUα, pα(1)Xu1+⋯+pα(l)Xul=1 and pα(1)Yu1+⋯+pα(l)Yul=1, or equivalently if
for all u∈LU, p1Xu1+⋯+pmXum=1
and p1Yu1+⋯+pmYum=1 (Case b)), then
[TABLE]
where, again, now m is defined on (Rl)2.
For instance, for m=2 and in the uniform case, the order
α(1)=2,α(2)=1,α(3)=2 gives the limiting distribution:
[TABLE]
i.e.,
[TABLE]
Also note that, sometimes, the limit in the above theorem
is simply a normal random variable.
Indeed, take p1X=1/3,p2X=2/3,p1Y=1/4,p2Y=3/4, and
α(1)=1,α(2)=2, then we
are in Case a), I={2} and:
[TABLE]
This is also, as one would expect, the limiting distribution of the number of 2’s in the first word (which is almost equal to LCnα). However, if we take α(1)=2,α(2)=1,α(3)=2,
the limit is more involved.
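As an informal illustration of the example with p1X=1/3, p2X=2/3, p1Y=1/4, p2Y=3/4 and α=(1,2), the following sketch simulates two words and compares the length of the longest common increasing subsequence with the number of 2's in each word. It is only a sanity check under stated assumptions: the helper names (`lci_length`, `sample_word`, `simulate`), the word length n=200 and the seed are arbitrary choices, not quantities from the paper.

```python
import random


def lci_length(x, y, m):
    """Longest common weakly increasing subsequence of x and y over the
    alphabet {1, ..., m}, via a block-by-block dynamic program."""
    n, k = len(x), len(y)
    prev = [[0] * (k + 1) for _ in range(n + 1)]
    for letter in range(1, m + 1):
        cur = [[0] * (k + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            for j in range(k + 1):
                best = prev[i][j]  # no further copy of `letter`
                if i > 0:
                    best = max(best, cur[i - 1][j])
                if j > 0:
                    best = max(best, cur[i][j - 1])
                if i > 0 and j > 0 and x[i - 1] == y[j - 1] == letter:
                    best = max(best, cur[i - 1][j - 1] + 1)
                cur[i][j] = best
        prev = cur
    return prev[n][k]


def sample_word(n, pmf, rng):
    """Draw n i.i.d. letters in {1, ..., len(pmf)} with probabilities pmf."""
    return rng.choices(range(1, len(pmf) + 1), weights=pmf, k=n)


def simulate(n=200, seed=0):
    rng = random.Random(seed)  # fixed seed: reproducible illustration
    x = sample_word(n, [1 / 3, 2 / 3], rng)
    y = sample_word(n, [1 / 4, 3 / 4], rng)
    return lci_length(x, y, 2), x.count(2), y.count(2)


lci, twos_x, twos_y = simulate()
print(lci, twos_x, twos_y)
```

Since the common subsequence made only of 2's is always admissible, the simulated length is never smaller than the minimum of the two counts of 2's; the claim in the text is that, in this regime, it is typically close to the number of 2's in the first word.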
For b∈N such that b≥m, let now Fmb denote the set of all
surjections from {1,…,b} to {1,…,m}, and let LCn(b) be the length of the longest common subsequences with b≥m blocks, with for each letter at least one block of this letter, and still allowing the blocks to have size zero. This is nothing but the maximum, over all the possible α∈Fmb, of LCnα, so, recalling the discussion preceding the statement of Theorem 3.1, we have:
Theorem 3.2.
In Case a),
[TABLE]
In Case b),
[TABLE]
Proof.
The proof of this theorem follows the lines of
the proof of our previous main result, considering
pα(i)∙ instead of pi∙.
∎
Note that LCn, the length of the longest common subsequences without any conditions on blocks, corresponds to LCn(n+m) (or, to be more precise, to LCn(b) for any b≥m+n−2: this is because when, say, there are only two kinds of letters involved in the longest common word, we have to take m−2 additional empty blocks to make α onto).
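The definition of LCn(b) as a maximum over surjections can be explored by brute force on very short words. The sketch below is an illustration under stated assumptions, not the authors' procedure: the names `lc_blocks` and `lc_fixed_blocks` are hypothetical, and the enumeration of all maps from {1,…,b} to {1,…,m} is exponential in b, so it is only usable for tiny examples.

```python
from itertools import product


def lc_blocks(x, y, alpha):
    """Longest common subsequence of x and y aligned in blocks alpha."""
    n, k = len(x), len(y)
    prev = [[0] * (k + 1) for _ in range(n + 1)]
    for letter in alpha:
        cur = [[0] * (k + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            for j in range(k + 1):
                best = prev[i][j]
                if i > 0:
                    best = max(best, cur[i - 1][j])
                if j > 0:
                    best = max(best, cur[i][j - 1])
                if i > 0 and j > 0 and x[i - 1] == y[j - 1] == letter:
                    best = max(best, cur[i - 1][j - 1] + 1)
                cur[i][j] = best
        prev = cur
    return prev[n][k]


def lc_fixed_blocks(x, y, m, b):
    """Maximum of lc_blocks over all surjections alpha from {1..b} onto {1..m}.

    Returns 0 when b < m, since no surjection exists in that case.
    """
    letters = range(1, m + 1)
    best = 0
    for alpha in product(letters, repeat=b):
        if set(alpha) == set(letters):  # keep only onto maps
            best = max(best, lc_blocks(x, y, alpha))
    return best


x = [1, 2, 1, 2]
y = [1, 2, 1, 2]
with_two_blocks = lc_fixed_blocks(x, y, m=2, b=2)
with_four_blocks = lc_fixed_blocks(x, y, m=2, b=4)
```

For these two identical words of length 4, two blocks only capture a common subsequence of length 3, while four blocks (for instance α=(1,2,1,2)) already recover the full longest common subsequence of length 4.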
Although the above theorem requires a fixed number of blocks, say, b, it is nevertheless
noteworthy that, whatever this fixed number is,
[TABLE]
3.3 Countably infinite alphabet
To continue, let us consider, as in [5, Section 4], the generalization to
countably infinite alphabets. Let the alphabet be N∗={1,2,…}, and
let (pi′X)i≥1 and (pi′Y)i≥1 be two probability mass functions on this
alphabet. We are now
interested in LCIn∞, the length of the longest common and increasing subsequences
over this countably infinite alphabet.
Let
[TABLE]
and let
[TABLE]
Let $m\in\mathbb{N}$, $m\ge 2$, be such that $\sum_{i=m}^{+\infty}p_i^{\prime X}<e_{\max}^\infty$
and $\sum_{i=m}^{+\infty}p_i^{\prime Y}<e_{\max}^\infty$. Let us consider the distributions over $\{1,\dots,m\}$ obtained by replacing all the letters greater than or equal to $m$ by $m$, namely, let $p_i^X:=p_i^{\prime X}$ for $i<m$ and $p_m^X:=\sum_{i=m}^{+\infty}p_i^{\prime X}$, and let $p_i^Y$, $1\le i\le m$, be defined in a similar fashion. Let now $LCI_n$ be the length
of the longest common and increasing subsequences of the words obtained by
replacing all the letters greater than or equal to $m$ by $m$, i.e., the longest
common and increasing
subsequences on $\{1,\dots,m\}$ associated with the probability mass functions $p^X$
and $p^Y$. Next we argue, via a sandwiching argument,
that when properly centered and scaled (note that $e_{\max}^\infty=e_{\max}$),
$LCI_n^\infty$ and $LCI_n$ tend to the same limit.
Indeed, let LCIn∗ be
the length of the longest common and increasing subsequences not using the letter m,
i.e., the length of the longest common and increasing subsequences on {1,…,m−1} associated with the probability mass functions p′X
and p′Y or, equivalently,
pX and pY.
Since m∉I (where I is defined with the distributions
(piX)1≤i≤m and (piY)1≤i≤m),
(LCIn∗−nemax)/√n and (LCIn−nemax)/√n converge to the same limiting distribution.
But,
[TABLE]
completing the proof.
From the proofs presented above, the passage from two to three or more sequences is
clear: the minimum over two Brownian functionals becomes a minimum over three or more
Brownian functionals, and such a passage applies to the cases touched upon above and below.
Throughout the text, the two sequences (Xk)k≥1 and (Yk)k≥1 are assumed to be independent
with respective i.i.d. components. In view of [6] or [3], one expects that the i.i.d. assumption
could be replaced by a Markovian one or even a hidden Markovian one.
Moreover, one further expects that the independence of the two sequences
is unnecessary and that a potential dependence structure
between the two sequences would carry over to the corresponding 2m-dimensional Brownian functionals;
another case at hand could be the hidden Markov framework. Finally, it should also be of interest
(as already done in [2] for uniform letters) to study the ramifications/connections
of our results with last passage percolation.
Define fν:E′2→R by
fν:x↦∑i=1m[(piXxiX+νiX)∧(piYxiY+νiY)]. In order to prove that m(ν) is well defined and (1.20), it is enough to prove that for all x∈E′2, there exists x′∈E′2 such that ∥x′∥∞≤2Cm∥ν∥∞ and fν(x′)≥fν(x).
Let x∈E′2. Firstly, assume that x∈P
(recalling (2.13)). If fν(x)<fν(0), taking x′=0 works, so assume
fν(x)≥fν(0). By (1.10) (applied twice),
[TABLE]
hence −f(x)≤2m∥ν∥∞ and, by Lemma 2.3, there exists xK∩P∈K∩P and xr∈E2 such that x=xK∩P+xr and ∥xr∥∞≤−Cf(x)≤2Cm∥ν∥∞.
But from the definition of K, fν(xK∩P+xr)=f(xK∩P)+fν(xr), and, since xK∩P∈K∩P, f(xK∩P)=0, so fν(x)=fν(xr). Moreover, since x∈P and
xiK∩P,∙=0 for all i∈Ic, xr∈E′2.
Now, if we do not assume x∈P anymore, observe that for ε>0 small enough,
εx∈P, so fεν(x′)≥fεν(εx) for some x′∈E′2 such that ∥x′∥∞≤2Cm∥εν∥∞. Finally, dividing by ε,
fν((1/ε)x′)≥fν(x) where ∥(1/ε)x′∥∞≤2Cm∥ν∥∞.
In Case b1), let us begin with the subcase I={1}. In this instance, p1X=p1Y=emax, while for all 1<i≤m, piX<emax or piY<emax (otherwise i would be in I).
We now show that “the maximum of fν is realized with the first letter plus one other letter”; more precisely, there exists x∈E′2 such that fν(x)=m(ν) and ∣{i∈{2,…,m}:xiX≠0 or xiY≠0}∣≤1. Indeed, using the same method as in the proof of Lemma 1.6, and keeping in mind that ν2∙=⋯=νm∙=0, one can see that there exists some x maximizing fν such that {i∈{1,…,m}:xiX≠0 or xiY≠0} has at most two elements, and these cannot both belong to {2,…,m}, since otherwise the corresponding coordinates would be null (by the definition of E′).
Returning to the proof of the lemma, we have shown that
[TABLE]
Fixing i0∈{2,…,m}, we have
[TABLE]
It is then easily seen that this last supremum does not change with the additional condition pi0XtX=pi0YtY. (Indeed, if, for example, pi0XtX>pi0YtY, reducing tX to transform
this strict inequality into equality will only increase the sum of the two minima
in the definition of fν.) Hence,
[TABLE]
Since i0∈/I, it is impossible for both pi0X−emax and pi0Y−emax to be positive, so this last supremum is attained at tX=0 (and is equal to ν1X∧ν1Y)
unless
ν1X<ν1Y and pi0X−emax>0, or
ν1X>ν1Y and pi0Y−emax>0, in which case the supremum is attained at
$t^X=\dfrac{\nu_1^Y-\nu_1^X}{e_{\max}\,(p_{i_0}^X-p_{i_0}^Y)/p_{i_0}^Y}$,
a value at which the two sides in the above minimum are equal
to each other. So if ν1X<ν1Y and pi0X−emax>0, or ν1X>ν1Y
and pi0Y−emax>0, then
[TABLE]
Assuming that ν1X<ν1Y, we see that in this case
m(νX,νY)=sXSY+tXSX. This remains true if SX=SY (in this case,
m(νX,νY)=SX=SY), and, similarly, when SY≤SX.
The proof of Case b1) is therefore done when I={1}.
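The location of the supremum in the subcase I={1} can be checked numerically. The sketch below assumes that, after imposing pi0XtX=pi0YtY, the quantity to maximize over tX≥0 is the minimum of the two affine functions ν1X+(pi0X−emax)tX and ν1Y+pi0X(pi0Y−emax)tX/pi0Y; this reduced form is a reconstruction from the surrounding text, and the function names and numerical values are purely illustrative.

```python
def reduced_objective(t, nu_x, nu_y, p_x, p_y, e_max):
    """Minimum of the two affine sides (assumed reduced form, see lead-in)."""
    side_x = nu_x + (p_x - e_max) * t
    side_y = nu_y + p_x * (p_y - e_max) / p_y * t
    return min(side_x, side_y)


def crossing_point(nu_x, nu_y, p_x, p_y, e_max):
    """Value of t at which the two sides of the minimum coincide."""
    return (nu_y - nu_x) / (e_max * (p_x - p_y) / p_y)


# Illustrative values with nu_x < nu_y and p_x - e_max > 0 (so p_y < e_max).
params = dict(nu_x=0.0, nu_y=1.0, p_x=0.5, p_y=0.1, e_max=0.4)

t_star = crossing_point(**params)
value_at_crossing = reduced_objective(t_star, **params)

# Brute-force comparison on a grid of t in [0, 2].
grid_max = max(reduced_objective(k / 1000, **params) for k in range(2001))
print(t_star, value_at_crossing, grid_max)
```

With these values the two sides cross at t=0.625, and the grid search does not find a larger value than the one at the crossing point, in agreement with the statement that the supremum exceeds ν1X∧ν1Y only in the two exceptional configurations.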
Still in Case b1), but without the assumption that I={1}, assume, without loss of generality,
that I={1,…,k}, k≥2.
Define ν~ by ν~1∙=S∙ and ν~i∙=0, for all
i≥2. Let x0∈E′2 be defined by x0,Y=0, x10,X=(SX−SY+ν1Y−ν1X)/emax, xi0,X=(νiY−νiX)/emax, for all i∈{2,…,k}, and xi0,∙=0 for all i∈{k+1,…,m}. Note that for all x∈E′2, fν(x+x0)=fν~(x),
so m(ν)=m(ν~). Moreover, defining x′ via x1′∙=x1∙+⋯+xk∙, xi′∙=0, for i∈{2,…,k}, and xi′∙=xi∙ everywhere else, we have x′∈E′2, and
[TABLE]
[TABLE]
Hence, fν~(x′)≥fν~(x), and therefore
[TABLE]
Now applying the subcase I={1} concludes the proof of Case b1).
In Case b2), again assume without loss of generality that I={1,…,k}, k≥2.
Let L1=(1,0,…,0,−1,0,…,0)∈R2k, having
k−1 zeros between the two non-zero coordinates,
let L2=(0,1,0,…,0,−1,0,…,0) (still with k−1 zeros between
the two non-zero coordinates), and iterate this process up to Lk.
Let also PX be the concatenation of PX∈Rk with
0∈Rk, and let PY be the concatenation
of 0∈Rk with PY∈Rk.
The vectors L1,…,Lk,PX,PY are linearly independent since,
as already seen in Lemma 1.4, PX and PY are linearly independent.
Now, let Q be a 2k×2k invertible matrix with first rows
L1,…,Lk,PX,PY (for example, to form
such a matrix Q, one could complete the first columns with vectors from
the canonical basis), let Δ∈R2k be defined by
[TABLE]
and let u∈R2k be defined by
[TABLE]
We have uiX−uiY=νiY−νiX (where uX is the vector of the first k coordinates
of u and uY the vector of the last k coordinates of u) for all i∈{1,…,k} : these conditions stem from the rows L1,…,Lk. Moreover, u1X/p1X+⋯+umX/pmX=u1Y/p1Y+⋯+umY/pmY=0 (conditions stemming from the rows PX,PY). Then, expand uX and uY to Rm by filling with zeros, so that u:=(uX,uY) is now in (Rm)2.
Setting, for all i∈{1,…,m},
yiX:=uiX/piX, yiY:=uiY/piY, leads to y∈(Rm)2, more precisely y∈E′2, such that for all i∈{1,…,m}, piXyiX+νiX=piYyiY+νiY,
with moreover
[TABLE]
Setting UX:=(uiX)i∈I∈Rk, UY:=(uiY)i∈I, RX:=(νiX)i∈I and RY:=(νiY)i∈I,
the above expression becomes
This shows that maxx∈E′2∑i=1m[(piXxiX+νiX)∧(piYxiY+νiY)]≥∑i∈I(sνiX/piX+tνiY/piY). Now let x∈E′2,
[TABLE]
We have x−y∈E′2 (recall, also, that yi=0 for all i∈Ic), so for some c>0, (x−y)/c∈P, and then f((x−y)/c)≤0, so f(x−y)≤0.
Hence ∑i=1m[(piXxiX+νiX)∧(piYxiY+νiY)]−∑i∈I(sνiX/piX+tνiY/piY)≤0 and, finally,
maxx∈E′2∑i=1m[(piXxiX+νiX)∧(piYxiY+νiY)]=∑i∈I(sνiX/piX+tνiY/piY).
∎
Bibliography
[1] F. Benaych-Georges and C. Houdré. GUE minors, maximal Brownian functionals and longest increasing subsequences in random words. Markov Process. Related Fields 21 (2015), 109–126.
[2] J.-C. Breton and C. Houdré. On the limiting law of the length of the longest common and increasing subsequences in random words. Stochastic Process. Appl. 127 (2017), 1676–1720.
[3] C. Houdré and G. Kerchev. On the rate of convergence for the length of the longest common subsequences in hidden Markov models. J. Appl. Probab. 56 (2019), no. 2, 558–573.
[4] C. Houdré, J. Lember and H. Matzinger. On the longest common increasing binary subsequence. C. R. Acad. Sci. Paris Ser. I 343 (2006), 589–594.
[5] C. Houdré and T. J. Litherland. On the longest increasing subsequence for finite and countable alphabets. High Dimensional Probability V: The Luminy Volume (2009), 185–212.
[6] C. Houdré and T. J. Litherland. On the limiting shape of Young diagrams associated with Markov random words. Markov Process. Related Fields 26 (2020), 779–838.
[7] M. Kiwi, M. Loebl and J. Matoušek. Expected length of the longest common subsequence for large alphabets. Adv. Math. 197 (2005), 480–498.
[8] Y. Zhang. Topics on the length of the longest common subsequences with blocks in binary random words. Ph.D. dissertation, Georgia Institute of Technology (2019).