Improvements of Plachky-Steinebach theorem
Henri Comman†
Pontificia Universidad Católica de Valparaiso, Avenida Brasil 2950, Valparaiso, Chile
Abstract.
We show that the conclusion of the Plachky-Steinebach theorem holds true
for intervals of the form ]Lr′(λ),y[, where Lr′(λ) is the right derivative (but not necessarily a derivative) of the generalized log-moment generating function L at some λ>0 and y∈ ]Lr′(λ),+∞],
under only the following two conditions: (a) Lr′(λ) is a limit point of the set {Lr′(t):t>λ}, (b) L(ti) is a limit with ti belonging to some decreasing sequence converging to sup{t>λ:L∣]λ,t] is affine}.
By replacing Lr′(λ) by Lr′(λ+),
the above result extends verbatim to the case λ=0 (replacing (a) by the right continuity of L at zero when
Lr′(0+)=−∞). No hypothesis is made on L∣]−∞,λ[ (e.g. L∣]−∞,λ[ may be the constant +∞ when λ=0); λ≥0 may be a non-differentiability point of L and moreover a limit point of non-differentiability points of L; λ=0 may be a left and right discontinuity point of L. The map L∣]λ,λ+ε[ may fail to be strictly convex for all ε>0. If we drop the assumption (b), then the same conclusion holds with upper limits in place of limits.
Furthermore, the foregoing is valid for general nets (μα,cα) of Borel probability measures and powers (in place of the sequence (μn,n−1)) and replacing the intervals ]Lr′(λ+),y[ by ]xα,yα[ or [xα,yα], where (xα,yα) is any net such that (xα)
converges to Lr′(λ+) and liminfαyα>Lr′(λ+).
2000 Mathematics Subject Classification:
Primary: 60F10
† Partially supported by FONDECYT grant 1120493.
1. Introduction and statement of the results
Let (μα,cα) be a net where μα is a
Borel probability measure on R, cα>0 and (cα) converges to zero. Let L
be the generalized log-moment generating function associated with (μα,cα), defined
for each t∈R by
L(t) = lim supα cα log ∫R e^{tx/cα} μα(dx);    (1)
when the above upper limit is a limit, we say that L(t) exists. Let λ>0 and assume that the derivative map L′ exists and is strictly monotone on a neighbourhood of λ (in particular, L(t) exists in R for all t in this neighbourhood). When (μα,cα) is a sequence of the form (μn,n−1), the Plachky-Steinebach theorem ([10]) asserts that for each sequence (xn) of real numbers converging to L′(λ) we have
[TABLE]
(Note that L(λ)−λL′(λ)=−L∗(L′(λ))=inf{L(t)−tL′(λ):t>0}, where L∗ denotes the Legendre-Fenchel transform of L.)
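The statement above can be checked numerically on a classical example. The following is a minimal sketch, assuming that μn is the law of the mean of n independent fair Bernoulli variables (so that L(t)=log((1+e^t)/2) and L(t) exists for all t), that the constant sequence xn=L′(λ) is used, and that the probability of interest is the upper tail μn([xn,+∞[); the function names are illustrative and not taken from the paper.

```python
import math

def L(t):
    """Log-moment generating function for the mean of fair Bernoulli variables."""
    return math.log((1.0 + math.exp(t)) / 2.0)

def L_prime(t):
    """Derivative of L."""
    return math.exp(t) / (1.0 + math.exp(t))

def log_upper_tail(n, x):
    """log P(S_n / n >= x) for S_n ~ Binomial(n, 1/2), computed by log-sum-exp."""
    k0 = max(0, math.ceil(n * x))
    logs = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        - n * math.log(2.0)
        for k in range(k0, n + 1)
    ]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

lam = 1.0
x = L_prime(lam)                           # constant sequence x_n = L'(lam)
predicted = L(lam) - lam * L_prime(lam)    # value of the limit in the theorem

# inf over t > 0 of L(t) - t * L'(lam), approximated on a grid containing lam
grid_inf = min(L(k / 1000.0) - (k / 1000.0) * x for k in range(1, 5001))

# (1/n) log mu_n([x, +infinity[) for increasing n
rates = {n: log_upper_tail(n, x) / n for n in (100, 1000, 10000)}
for n, rate in rates.items():
    print(n, rate, predicted)
```

In this example the normalized log-probabilities approach L(λ)−λL′(λ) from below as n grows, and the grid infimum agrees with the same value, in line with the identity noted in parentheses above.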
In this paper, we improve the above result in various ways: First, we weaken all the hypotheses by
(a) replacing the differentiability of L on a neighbourhood of λ by the condition that Lr′(λ) is a limit point of the set {Lr′(t):t>λ}, where Lr′ denotes the right derivative map of L, (b)
removing entirely the strict convexity hypothesis, (c) allowing λ=0 (with the only hypothesis of right continuity of L at zero when Lr′(0+)=−∞),
(d) requiring the existence of L only on a suitable sequence, (e) allowing general nets (μα,cα,xα) in place of the sequence (μn,n−1,xn); second, we strengthen the conclusion by allowing a more general class of intervals in the left hand side of (2); third, we generalize all the above by giving a version with upper limits.
For each λ≥0, our result covers all situations except when the condition mentioned in (a) or in (c) fails, i.e. excluding the following ones (cf. Lemma 1):
(i) The map Lr′∣]λ,+∞[ takes only infinite values (equivalently, either L∣[0,+∞[ is improper, or L(t)=+∞ for all t>λ).
(ii) There exists T∈ ]λ,+∞] such that L∣]λ,T[ is affine, with Lr′(λ+)<Lr′(T)<+∞ when T<+∞.
(iii) λ=0, Lr′(0+)=−∞ and L is right discontinuous at zero.
(In particular, in the most common case where L∣[λ,λ+ε[ is real-valued and continuous for some ε>0, only (ii) is excluded.)
Equivalently, Theorem 1 applies in (and only in) all the cases
where Lr′(λ+)=limiLr′(λi) for some sequence (λi) fulfilling eventually
[TABLE]
with L(0+)=0 when λ=0 and Lr′(0+)=−∞ (note that the above condition is satisfied under the hypotheses of the Plachky-Steinebach theorem); then, there is
a unique real number λ~≥λ fulfilling Lr′(λ+)=Lr′(λ~+) and Lr′(t)>Lr′(λ+) for all t>λ~; in fact,
λ~=sup{t>λ:L∣]λ,t] is affine}=limiλi for every sequence as above (cf. Lemma 1).
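To make the quantity λ~ concrete, here is a minimal sketch that locates it by bisection on the right derivative. The toy function L(t)=max(t−1,0)² is a hypothetical choice (convex, L(0)=0, affine on [0,1], strictly convex on [1,+∞[) and is not claimed to come from a particular net; the helper names are illustrative. The sketch assumes λ>0, so that Lr′(λ+)=Lr′(λ), and relies only on the monotonicity of Lr′.

```python
def right_derivative(t):
    """Right derivative of the toy convex function L(t) = max(t - 1, 0)**2."""
    return 2.0 * max(t - 1.0, 0.0)

def lambda_tilde(d_right, lam, upper, tol=1e-10):
    """Approximate sup{t > lam : L is affine on ]lam, t]} for lam > 0.

    d_right is the (non-decreasing) right derivative of L. The value sought is
    the infimum of the points t in ]lam, upper] where d_right(t) exceeds
    d_right(lam), which is found by bisection.
    """
    base = d_right(lam)
    if d_right(upper) <= base:
        raise ValueError("the right derivative does not increase on ]lam, upper]")
    lo, hi = lam, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d_right(mid) > base:
            hi = mid      # strict increase already happened before mid
        else:
            lo = mid      # L is still affine on ]lam, mid]
    return lo

inside_affine_piece = lambda_tilde(right_derivative, 0.5, 3.0)
inside_strictly_convex_piece = lambda_tilde(right_derivative, 1.5, 3.0)
print(inside_affine_piece, inside_strictly_convex_piece)
```

For λ=0.5 the affine piece ends at 1, so λ<λ~=1; for λ=1.5 the function is strictly convex to the right of λ, so λ~=λ.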
Let us emphasize that no hypothesis is made on the map L∣]−∞,λ[.
The proof deals only with the properties of the right derivative map Lr′∣[λ,λ+ε[;
in particular, the following special situations may arise:
λ may be a non-differentiability point of L, and moreover a limit point of non-differentiability points of L
(e.g. when λ=λ~ and eventually λi is a non-differentiability point of L).
L may be left discontinuous at λ when λ=0 (which is the case when Lr′(0)=−∞) and moreover right discontinuous at zero (when Lr′(0)=−∞<Lr′(0+)); cf. Remark 4.
L∣]λ,λ+ε[ may not be strictly convex for all ε>0 (e.g. when λ=λ~ and eventually L∣[λi+1,λi] is affine); when λ<λ~, L is not strictly convex in any neighbourhood of λ or λ~.
As regards the strong version with limits, for each t<λ, L(t) may not be a limit.
The foregoing contrasts sharply with the hypotheses of the Plachky-Steinebach theorem, which require λ>0 and some ε>0 such that L(t) exists in R for all t∈ ]λ−ε,λ+ε[ and L∣]λ−ε,λ+ε[ is strictly convex and differentiable.
Theorem 1 below constitutes the first general result giving the upper limit (and a fortiori, the limit) of
(cαlogμα([xα,yα])) or (cαlogμα(]xα,yα[)) in terms of L∗, when limαxα is not a value of the derivative map of the generalized log-moment generating function (cf. Remark 5).
Some words of caution about the notation in Theorem 1: First,
the net (μα) is considered as a net of Borel measures on [−∞,+∞] in order to give a sense to μα([xα,yα]) when xα=−∞ or yα=+∞; second, when Lr′(λ+)=−∞, we adopt the convention 0⋅Lr′(λ+)=0; third, in order to simplify the wording of the theorem,
we do not distinguish notationally between the cases λ>0 and λ=0, so that we use Lr′(λ+) and L(λ+), although Lr′(λ+)=Lr′(λ) and L(λ+)=L(λ) when λ>0, and Lr′(λ+)=−∞ implies λ=0 (cf. Lemma 2).
Theorem 1**.**
Let λ≥0. When Lr′(λ+)>−∞ we assume that Lr′(λ+) is a limit point of the set {Lr′(t):t>λ}; when Lr′(λ+)=−∞ we assume that L(λ)=L(λ+). For each net (xα,yα) in [−∞,+∞]2 such that (xα) converges to Lr′(λ+) and liminfαyα>Lr′(λ+) we have
[TABLE]
Furthermore, if L(ti) exists for a sequence (ti) in ]λ~,+∞[ converging to λ~, then the above upper limits are limits (with λ~=sup{t>λ:L∣]λ,t] is affine}).
It is possible that for some ε>0, λ is the only point in [0,λ+ε]
where
Theorem 1 applies
(e.g. when λ=λ~, L is not differentiable at λ, and eventually λi appearing in (3) is a non-differentiability point of L and
L∣[λi+1,λi] is affine; although this may seem to be an extreme case, we show in Appendix A that there are plenty of such examples); in the light of the above,
Theorem 1 may be thought of as a pointwise version of the Plachky-Steinebach theorem.
Hereafter, we focus on the opposite situation, motivated by the following observation:
If L exists as a differentiable and strictly convex map on some open interval containing λ>0 and zero (say ]−ε,λ+ε[), then the hypotheses of the Plachky-Steinebach theorem
hold for all t∈ ]0,λ+ε[ (in place of λ), and thus it applies to every sequence converging to L′(t) for all t∈ ]0,λ+ε[; since
L′∣]0,λ+ε[ is in this case an increasing homeomorphism,
the limit obtained, as a function of L′(t), is exactly −L∗∣]Lr′(0),Ll′(λ+ε)[ (where Ll′ denotes the left derivative map of L);
a weak version of this (with the constant sequence (xn)≡L′(t)) has been
widely used in the context of dynamical systems
(e.g. [8], Lemma 13.2; [2], Lemma 6.2; [3], Corollary 4.3; [4], Theorem 2.2; [1], Proposition 7).
A much stronger result can be derived from Theorem 1, as the
following Corollary 1 shows;
aside from the extension with upper limits and the possibility of considering nets in place of sequences, the improvements are obtained first, by allowing a more general class of intervals in the conclusion, and second, by weakening the hypotheses as follows:
The differentiability of L∣]−ε,λ+ε[ is replaced by the differentiability of L∣]λ,λ+ε[.
The strict convexity of L∣]−ε,λ+ε[ is replaced by the condition that
L′∣]λ,λ+ε[ does not attain its supremum (equivalently, for each t∈ ]λ,λ+ε[ the map L∣[t,λ+ε[ is not affine).
(Note first, that this condition is obviously fulfilled when
L∣]t,λ+ε[ is strictly convex for some t∈[λ,λ+ε[, and second,
it is far weaker than strict convexity: There may be a countably infinite set {Si:i∈N} of non-empty mutually disjoint intervals included in ]λ,λ+ε[ on which L is affine; when supisupSi=λ+ε,
the above condition implies supSi<λ+ε for all i∈N.)
The possibility to take λ=0, including the cases L′(0+)=−∞ and L′(0)=−∞<L′(0+).
The version with limits only requires the existence of L on a sequence (ti) in ]t~,+∞[ converging to t~, for all t∈[λ,λ+ε[; note that the set [λ,λ+ε[∖{t~:t∈[λ,λ+ε[} contains all the open intervals on which L is affine.
Corollary 1**.**
Let λ≥0 and let ε>0. We assume that L∣]λ,λ+ε[ is differentiable and sup{L′(t):t∈ ]λ,λ+ε[} is not a maximum; when λ=0 and Lr′(0+)=−∞, we further assume that L is right continuous at zero.
a) For each z∈[Lr′(λ+),Ll′(λ+ε)[,
there exists tz∈[λ,λ+ε[ fulfilling Lr′(tz+)=z, and for each such tz, for each net (xα,z) in [−∞,+∞] converging to z, and for each net
(yα,z) in [−∞,+∞] fulfilling liminfαyα,z>z, we have
[TABLE]
Furthermore, if L(tz,i) exists for a sequence (tz,i) in ]tz~,+∞[ converging to tz~, then the above upper limits are limits (with tz~=sup{t>tz:L∣]tz,t] is affine}).
b) The map [Lr′(λ+),Ll′(λ+ε)[∋z→L(tz+)−tzz is
]−∞,0]-valued,
strictly decreasing, continuous, unbounded below if and only if Ll′(λ+ε)=+∞, and vanishes at Lr′(λ+) if and only if either λ>0 and L is differentiable at λ and linear on [0,λ], or
λ=0 and L is right continuous at zero; its restriction to
]Lr′(λ+),Ll′(λ+ε)[ is strictly concave, and it is
furthermore differentiable when
L∣]λ~,λ+ε[ is strictly convex (with λ~=sup{t>λ:L∣]λ,t] is affine}).
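The qualitative properties of the map z→L(tz+)−tzz can be observed on a simple case. The sketch below assumes, purely for illustration, the fair Bernoulli function L(t)=log((1+e^t)/2) with λ=0 and ε=5, for which L is differentiable, tz is obtained by inverting L′ explicitly, and Lr′(0+)=L′(0)=1/2; none of these choices come from the paper.

```python
import math

def L(t):
    """Illustrative log-moment generating function (fair Bernoulli)."""
    return math.log((1.0 + math.exp(t)) / 2.0)

def t_of_z(z):
    """Inverse of L'(t) = e^t / (1 + e^t), defined for 0 < z < 1."""
    return math.log(z / (1.0 - z))

def limit_value(z):
    """The map z -> L(t_z) - t_z * z."""
    t = t_of_z(z)
    return L(t) - t * z

# Uniform grid in [1/2, 0.99], contained in [L'(0), L'(5)[ since L'(5) > 0.99
zs = [0.5 + 0.49 * k / 100 for k in range(101)]
values = [limit_value(z) for z in zs]

print(values[0], values[50], values[100])
```

On this grid the values start at 0 for z=1/2 (here λ=0 and L is right continuous at zero), are non-positive, decrease strictly, and have negative second differences, which is consistent with the strict concavity stated for the open interval.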
Remark 1*.*
We have Lr′(λ+)=−∞ only if λ=0, in which case
the hypothesis of right continuity of L at zero in Theorem 1 (and Corollary 1) cannot be removed; indeed, suppose that L(0+)≠0, and let c∈ ]0,1[.
Since for each index α the measure μα is tight, there exists a net (xα) of real numbers converging to −∞ such that μα([xα,−xα])>c, hence
[TABLE]
and the conclusion does not hold with (xα). The hypothesis for the case Lr′(λ+)>−∞ also holds when Lr′(λ+)=−∞ (cf. Lemma 1b)).
Remark 2*.*
The upper limits in Theorem 1 vanish if and only if one of the following cases occurs:
λ>0, L is differentiable at λ and linear on [0,λ];
λ=0 and L is right continuous at zero.
Remark 3*.*
Under the hypotheses of Theorem 1,
the equality L∗(Lr′(λ+))=limαL∗(xα) fails when eventually
xα does not belong to the effective domain of L∗, which happens if and only if L∣]−∞,λ[ is linear with slope Lr′(λ+) (equivalently, Lr′(λ+) is the left end-point of the effective domain of L∗)
and eventually xα<Lr′(λ+).
Remark 4*.*
Regarding the proof as well as the wording of Theorem 1, the main difference between the cases λ>0 and λ=0 stems from the fact that when the map L∣[0,+∞[ is proper (which is implied by the hypotheses, cf. Lemma 1), zero is the only nonnegative real number λ in the effective domain of L∣[0,+∞[ for which:
Lr′(λ) or Lr′(λ+) may take the value −∞;
Lr′(λ) may differ from Lr′(λ+);
Lr′(λ+)<+∞ and L may be right discontinuous at λ.
(cf. Lemma 2 and Lemma 3; for instance, the conclusion in the last assertion of Lemma 2 is not true for λ=0). Theorem 1 and Corollary 1
include the case λ=0 and Lr′(0)=−∞<Lr′(0+), which implies the left and right discontinuities of L at zero (cf. Lemma 3); the case λ=0 and Lr′(0+)=−∞ implies the left discontinuity of L at zero (in both cases, we have L∣]−∞,0[≡+∞).
Remark 5*.*
The standard version of the Gärtner-Ellis theorem is unworkable when L is not differentiable at λ: Indeed,
the main hypothesis (i.e. essential smoothness) implies the differentiability of L on the interior of its effective domain
(besides requiring the existence of L on an open interval containing zero);
the same applies to the variant of the Gärtner-Ellis theorem given by Theorem 5.1 of [9] (which allows L to exist only on some open interval not necessarily containing zero) since it also requires essential smoothness.
Corollary 1 of [5] strengthens both above versions, but, although weaker than essential smoothness, the general hypothesis
is still a global condition requiring the existence of L on some open interval and
relating the range of the one-sided derivatives of L with the effective domain of L∗, and thus it is of no use here.
2. Proofs
Recall that L is a [−∞,+∞]-valued convex function ([7], [6]), and such a function is said to be proper when it is ]−∞,+∞]-valued and takes at least one finite value; note that
L(0)=0. We denote by L∗ and Lr′ (resp. Ll′) respectively the Legendre-Fenchel transform and right (resp. left) derivative map of L, where for each t≥0 we put Lr′(t)=+∞ (resp. Lr′(t)=−∞) when L(s)=+∞ for all s>t (resp. L(s)=−∞ for some s>t); note that L∗ is [0,+∞]-valued since L(0)=0.
The basic properties of the map Lr′∣]0,+∞[ are summarized in Lemma 1a), Lemma 2 and Lemma 3;
in particular, Lr′∣]0,+∞[ is non-decreasing so that the quantity Lr′(λ+)=limt→λ,t>λLr′(t) is well-defined for all λ≥0.
Let λ≥0. All the lemmas below hold for any net (μα,cα) as in (1); the hypotheses of Theorem 1 are made only in the last part of the proof.
Lemma 1**.**
a) The map L∣[0,+∞[ is improper if and only if there exists T∈ ]0,+∞] such that L∣]0,T[=Lr′∣[0,T[≡−∞ and L∣]T,+∞[=Lr′∣[T,+∞[≡+∞.
b) If L∣[0,+∞[ is proper, then
one and only one of the following cases holds:
(i) There exists a sequence (Lr′(λi)) in ]Lr′(λ+),+∞[ converging to
Lr′(λ+).
(ii) There exists ε>0 such that L∣]λ,λ+ε] is affine and Lr′(λ+)<Lr′(λ+ε)<+∞.
(iii) L∣]λ,+∞[ is affine.
(iv) L(λ)∈R and L(t)=+∞ for all t>λ.
(v) L(t)=+∞ for all t≥λ.
In particular, (i) or (ii) or (iii) holds if and only if there exists ε>0 such that L∣[λ,λ+ε] is real-valued, bounded, and L∣]λ,λ+ε] (resp. L∣[λ,λ+ε] when λ>0) continuous.
Furthermore, (i) holds if and only if there exists λ~∈[λ,+∞[ fulfilling
Lr′(λ+)=Lr′(λ~+) and Lr′(t)>Lr′(λ+) for all t>λ~; such a λ~ is unique and given by
[TABLE]
for every sequence (λi) as in (i).
Proof.
a) It is a direct consequence of the definitions together with the convexity of L and the fact that L(0)=0.
b) Assume that L∣[0,+∞[ is proper.
If L(λ)=+∞, then L(t)=+∞ for all t≥λ (because L is convex and L(0)=0) and (v) holds.
Assume now that L(λ)<+∞. Then
L(λ)∈R because L∣[0,+∞[ is proper.
Since L is convex and L(0)=0, we have either L(t)=+∞ for all t>λ and (iv) holds, or
L∣[λ,λ+δ[ is real-valued for some δ>0, in which case
Lr′(t)∈R for all t∈ ]λ,λ+δ[.
Assume that (i) does not hold, i.e. Lr′(λ+)<inf{Lr′(t)∈ ]Lr′(λ+),+∞[:t>λ}.
If {Lr′(t)∈ ]Lr′(λ+),+∞[:t>λ}=∅, then Lr′(t)=Lr′(λ+) for all t>λ, and (iii) holds. If {Lr′(t)∈ ]Lr′(λ+),+∞[:t>λ}≠∅, then
there exists ε∈ ]0,δ[ such that Lr′(λ+)<Lr′(λ+ε) and
Lr′(λ+)=Lr′(t) for all t∈ ]λ,λ+ε[, so that (ii) holds. The first assertion is proved.
Assume that either (i), (ii), or (iii) holds. There exists ε>0 such that
Lr′∣]λ,λ+ε[ is real-valued, hence L∣[λ,λ+ε[ is real-valued.
The map
L∣]λ,λ+ε/2] (resp. L∣[λ,λ+ε/2] when λ>0) is continuous since ]λ,λ+ε/2] (resp. [λ,λ+ε/2] when λ>0)
belongs to the interior of the effective domain of L∣[0,+∞[.
The boundedness of
L∣[λ,λ+ε/2] follows from the continuity when λ>0, and the boundedness of
L∣]λ,λ+ε/2] when λ=0 follows
from the convexity and the fact that L(0)=0.
Conversely, if L∣[λ,λ+ε[ is real-valued for some ε>0, then the first assertion
implies that
either (i), (ii), or (iii) holds.
The second assertion is proved.
Assume that there exists
s∈[λ,+∞[ fulfilling Lr′(λ+)=Lr′(s+) and Lr′(t)>Lr′(λ+) for all t>s.
We have Lr′(λ+)=limt→s,t>sLr′(t)
hence (i) holds.
Assume that (i) holds. Put λ~=sup{t>λ:L∣]λ,t] is affine}.
If λ~=+∞, then Lr′(t)=Lr′(λ+) for all t>λ, which contradicts (i), hence λ~∈[λ,+∞[.
Let t>λ~. Since L∣]λ,t] is not affine, we have Lr′(t)>Lr′(λ+).
Suppose that Lr′(λ+)<Lr′(λ~+); by (i) there exists Lr′(λi) such that
[TABLE]
which implies λ<λi<λ~ and
contradicts the fact that L∣]λ,λ~[ is affine; therefore, Lr′(λ+)=Lr′(λ~+).
Let (λi) be a sequence as in (i). For each ε>0 we have eventually
[TABLE]
hence eventually
λ~<λi<λ~+ε, which implies
λ~=limiλi.
The equalities Lr′(λ+)=Lr′(λ~+)=Lr′(s+) imply that L∣]λ,s] is affine, hence s≤λ~ by definition of λ~. Since Lr′(t)>Lr′(λ+)=Lr′(λ~+) for all t>s, we have Lr′(s+)≥Lr′(λ~+) hence s≥λ~;
therefore, s=λ~. The proof of the third assertion is complete.
∎
Lemma 2**.**
The following statements are equivalent:
(i) L∣[0,+∞[ is proper;
(ii) Lr′∣]0,+∞[ is ]−∞,+∞]-valued, non-decreasing and right continuous.
When the above holds and λ>0, we have Lr′(λ)∈R if and only if L(t)<+∞ for some t>λ.
Proof.
(ii)⇒(i) follows from the definitions since Lr′ takes the value −∞ when L∣[0,+∞[ is improper (cf. Lemma 1).
Assume that (i) holds.
Since L is convex and L(0)=0, the map Lr′∣]0,+∞[ is ]−∞,+∞]-valued. Let λ>0.
If Lr′(λ)=+∞, then λ fulfils one of the last two cases of Lemma 1, so that
Lr′ takes only the value +∞ on [λ,+∞[ hence is non-decreasing and right continuous on
[λ,+∞[.
Assume that Lr′(λ)<+∞. Then λ fulfils one of the first three cases of
Lemma 1; in particular, there exists ε>0 such that L∣[λ,λ+ε] is real-valued and continuous.
Extend L∣[λ,λ+ε] to a lower semi-continuous function f on R by putting f(x)=+∞ for all x∈R∖[λ,λ+ε]. The right derivative map of f is non-decreasing and right continuous on R by
Theorem 24.1 of [11] so that Lr′ is non-decreasing on [λ,λ+ε] and right continuous on [λ,λ+ε[;
therefore, (ii) holds and the first assertion is proved. If L(t)<+∞ for some t>λ, then Lr′(λ)<+∞ (by definition) hence Lr′(λ)∈R since λ>0 and L∣[0,+∞[ is proper; the converse is obvious.
∎
Lemma 3**.**
The following statements are equivalent:
(i) Lr′ is not right continuous at zero;
(ii) L∣[0,+∞[ is not lower semi-continuous at zero, Lr′(0)=−∞ and Lr′(0+)>−∞;
(iii) L∣[0,+∞[ is not lower semi-continuous at zero and Lr′(0+)∈R;
(iv) Lr′(0)=−∞ and Lr′(0+)>−∞.
The above equivalences hold verbatim replacing Lr′(0+)>−∞ by Lr′(0+)∈R. In particular, Lr′(0)∈R if and only if Lr′(0)=Lr′(0+)∈R.
Proof.
Assume that (i) holds. By Lemma 1a) the map L∣[0,+∞[ is proper, and by Lemma 1b) there exists ε>0 such that L∣[0,ε] is real-valued and L∣]0,ε] is continuous.
If L∣[0,+∞[ is lower semi-continuous at zero, then the extension of L∣[0,ε] to a lower semi-continuous function on R yields the right continuity of Lr′ at zero (by Theorem 24.1 of [11]), which gives a contradiction; therefore, L∣[0,+∞[ is not lower semi-continuous at zero; since L is convex and L(0)=0 it follows that L(0+) exists as a negative number, and consequently, Lr′(0)=−∞. Since Lr′(t)∈R for all t∈ ]0,ε[, and Lr′∣]0,+∞[ is non-decreasing by Lemma 2, Lr′(0+) exists in [−∞,+∞[, hence Lr′(0+)>−∞ (since Lr′ is not right continuous at zero by hypothesis) and (ii) holds.
Assume that (ii) holds. Then, Lr′(0+)∈R (otherwise Lr′(0+)=+∞ would imply Lr′(0)=+∞ and a contradiction) hence (iii) holds.
Assume that (iii) holds. By Lemma 1 there exists ε>0 such that L∣[0,ε] is real-valued and L∣]0,ε] is continuous,
hence (since L is convex and L(0)=0), L(0+) exists as a negative number, which implies Lr′(0)=−∞, and gives (iv). Since the implication (iv)⇒(i) is obvious, the proof of the first two assertions is complete; the last assertion is a direct consequence of them.
∎
Let l0 be the function defined on R by
[TABLE]
note that l0 is [0,+∞]-valued and lower semi-continuous.
Lemma 4**.**
We have
[TABLE]
Proof.
Since for each real number λ≠0 the set {x∈R:a≤eλx≤b} is compact for all (a,b)∈R2 with a≤b, Theorem 1 of [6] yields
[TABLE]
Since l0 is a [0,+∞]-valued function and L(0)=0, the above inequality is true with λ=0 so that
[TABLE]
hence
[TABLE]
∎
Lemma 5**.**
Assume that L∣[0,+∞[ is proper and L(t)<+∞ for some t>λ. There exists ε>0 such that for each t∈ ]λ,λ+ε[ we have
[TABLE]
for some xt∈R.
Proof.
By Lemma 1 there exists ε>0 such that
L∣[λ,λ+ε] is real-valued, hence
[TABLE]
(Lemma 4.3.8 of [7]).
Part (b) of Theorem 1 of [6] yields for each t∈ ]λ,λ+ε] and for each
M large enough
[TABLE]
Let t∈ ]λ,λ+ε[. For each integer n≥1 there exists xn≤M/t such that
[TABLE]
hence xn∈[L(t)/t−1/t,M/t]. Therefore, the sequence (xn) has a subsequence (xnm) converging to some x∈[L(t)/t−1/t,M/t], so that letting n→+∞ in (5) yields
[TABLE]
where the last inequality follows from the lower semi-continuity of l0.
From the above expression and (4), we get L(t)=tx−l0(x), which proves the lemma.
∎
Lemma 6**.**
Assume that L∣[0,+∞[ is proper and L(t)<+∞ for some t>λ. There exists ε>0 such that
[TABLE]
When Lr′(λ+)=−∞, the above equality is true with t=λ, and Lr′(λ+) in place of Lr′(t).
Proof.
By Lemma 1
there exists ε0>0 such that
L∣[λ,λ+ε0] is real-valued bounded and L∣]λ,λ+ε0] continuous. Let t∈ ]λ,λ+ε0[.
For each ε∈[0,λ+ε0−t[ let ∂L(t+ε) denote the set of subgradients of L at t+ε. Note that
[TABLE]
and
[TABLE]
For each ε∈ ]0,λ+ε0−t[, Lemma 5 yields some xt+ε∈R such that
[TABLE]
hence by Lemma 4,
[TABLE]
since L≥L∗∗ we obtain
l0(xt+ε)=L∗(xt+ε);
in particular, xt+ε∈∂L(t+ε).
Putting x=Lr′(t) we have
[TABLE]
hence limε→0xt+ε=x
(because limε→0Lr′(t+ε)=x by Lemma 2) and
[TABLE]
which implies
[TABLE]
where the first inequality follows from the lower semi-continuity of l0, and
the second inequality follows from Lemma 5;
therefore, L(t)=tx−l0(x) hence l0(x)=L∗(x)
since L(t)=tx−L∗(x) (because x∈∂L(t)); this proves the first assertion.
When Lr′(λ+)=−∞, the hypotheses imply Lr′(λ+)∈R and the last assertion follows noting that the above proof works verbatim with t=λ and Lr′(λ+) (resp. L(λ+)) in place of Lr′(t) (resp. L(t)).
∎
Lemma 7**.**
Assume that L∣[0,+∞[ is proper and L(t)<+∞ for some t>0. Let
(ti) be a sequence of positive numbers converging to zero.
We have
[TABLE]
If furthermore Lr′(0+)>−∞, then
[TABLE]
Proof.
By Lemma 1 (applied with λ=0),
there exists ε>0 such that L∣]0,ε] is real-valued, bounded and continuous, hence eventually
Lr′(ti)∈R,
[TABLE]
and L(0+) exists in ]−∞,0]; furthermore, Lr′(0+) exists in [−∞,+∞[ by
Lemma 2. Since L∗ is continuous on its effective domain, and
L(0+) is a non-positive real number, (6) ensures the existence of
limiL∗(Lr′(ti)) in [0,+∞[.
First assume that Lr′(0+)>−∞. Then
Lr′(0+)∈R and (6) yields
[TABLE]
which proves the first assertion and the first equality of the second assertion; we have
[TABLE]
and the second equality of the second assertion follows.
Assume that Lr′(0+)=−∞.
For each i large enough there exists λi>0 such that
λiLr′(ti)=L(λi), hence
[TABLE]
and
[TABLE]
so that the first assertion follows from (6) since
−L(0+)≥0.
∎
Lemma 7 shows that when Lr′(0+)=−∞, the map L∗ extends by continuity to a
[−L(0+),+∞]-valued map on [−∞,+∞[
by putting L∗(−∞)=−L(0+); in what follows, we implicitly use this extension.
Lemma 8**.**
We assume that L∣]λ,λ+ε[ is differentiable for some ε>0.
a) L′∣]λ,λ+ε[ extends to a non-decreasing
continuous surjection between [λ,λ+ε] and [Lr′(λ+),Ll′(λ+ε)].
b) L∗∣[Lr′(λ+),Ll′(λ+ε)[ is [0,+∞[-valued and
extends to a [0,+∞]-valued, strictly increasing and continuous map on
[Lr′(λ+),Ll′(λ+ε)], which takes the value +∞ at Ll′(λ+ε) if and only if Ll′(λ+ε)=+∞;
it vanishes at Lr′(λ+) if and only if either λ>0, L is differentiable at λ and linear on [0,λ], or
λ=0 and L is right continuous at zero. The map L∗∣]Lr′(λ+),Ll′(λ+ε)[ is positive and
strictly convex; if furthermore, λ~<λ+ε and L∣]λ~,λ+ε[ is strictly convex, then L∗∣]Lr′(λ+),Ll′(λ+ε)[ is differentiable (with λ~=sup{t>λ:L∣]λ,t] is affine}).
Proof.
a) Since L∣]λ,λ+ε[ is differentiable,
the map L′∣]λ,λ+ε[ is continuous by Corollary 25.5.1 of [11]. Since Ll′(λ+ε)=limt→ε,t<εL′(λ+t), the map L′∣]λ,λ+ε[ is a non-decreasing
continuous surjection onto ]Lr′(λ+),Ll′(λ+ε)[, which can be extended by continuity to a non-decreasing
continuous surjection between [λ,λ+ε] and [Lr′(λ+),Ll′(λ+ε)].
b) Each t∈ ]λ,λ+ε[ is a subgradient of L∗ at L′(t) so that L∗∣]Lr′(λ+),Ll′(λ+ε)[ is [0,+∞[-valued and strictly increasing, hence
L∗∣[Lr′(λ+),Ll′(λ+ε)[ is [0,+∞[-valued and strictly increasing; in particular, L∗∣]Lr′(λ+),Ll′(λ+ε)[ is positive.
Since L∗ is lower semi-continuous, it is continuous on its effective domain so that L∗∣]Lr′(λ+),Ll′(λ+ε)[ is continuous, and thus extends to a [0,+∞]-valued, strictly increasing and continuous map on
[Lr′(λ+),Ll′(λ+ε)]; this extended map can take the value +∞ only at Ll′(λ+ε); since L∗ is lower semi-continuous,
L∗(Ll′(λ+ε)) is finite when Ll′(λ+ε) is finite; conversely, if
Ll′(λ+ε)=+∞, then ]Lr′(λ+),+∞[ is included in the effective domain of L∗, hence limt→λ+ε,t<λ+εL∗(Lr′(t))=+∞.
Let g be the extension of L∣]λ,λ+ε[ to R defined by putting
[TABLE]
so that g is convex, lower semi-continuous, differentiable on the interior of its effective domain, but not sub-differentiable at each point in the complement of the interior of its effective domain; therefore, the Legendre-Fenchel transform g∗ of g
is strictly convex on the interior of its effective domain ([12], Theorem 11.13), hence on
]gr′(λ+),gl′(λ+ε)[.
Since g′(t)=L′(t) for all t∈ ]λ,λ+ε[, we obtain
g∗∣]gr′(λ+),gl′(λ+ε)[=L∗∣]Lr′(λ+),Ll′(λ+ε)[, which proves the strict convexity property.
The first part of the first assertion, and the first part of the second assertion are proved.
The continuity and strict increasingness imply that
inf[Lr′(λ+),Lr′(λ+ε)[L∗ is a minimum, which is attained at the unique point Lr′(λ+); this proves the second part of the first assertion since L∗(Lr′(λ+))=λLr′(λ+)−L(λ+) (using Lemma 7 when λ=0).
Assume furthermore that L∣]λ,λ+ε[ is strictly convex. The map
g is not sub-differentiable at some t∈[λ,λ+ε] if and only if either t=λ=0 and gr′(0+)=−∞, or t=λ+ε and gl′(λ+ε)=+∞, hence
]λ,λ+ε[ ⊂ {t∈[λ,λ+ε]: g is sub-differentiable at t}.
Therefore, the map g1[λ,λ+ε]+∞1R∖[λ,λ+ε] is strictly convex on every convex subset of the set {t∈[λ,λ+ε]: g1[λ,λ+ε]+∞1R∖[λ,λ+ε] is sub-differentiable at t}, hence its Legendre-Fenchel transform is differentiable on the interior of its effective domain ([12], Theorem 11.13); consequently, L∗∣]Lr′(λ+),Ll′(λ+ε)[ is differentiable.
Assume that λ~<λ+ε and L∣]λ~,λ+ε[ is strictly convex. The above proof and the proof of part a) work verbatim replacing ]λ,λ+ε[ by ]λ~,λ+ε[.
Since L is differentiable at λ~, we have Lr′(λ+)=Lr′(λ~+), which proves
the second part of the second assertion.
∎
Proof of Theorem 1.
Put x=Lr′(λ+).
The hypotheses imply that L∣[0,+∞[ is proper
(otherwise, by Lemma 1a), Lr′(t)∈{−∞,+∞} for all t>0, which contradicts the hypothesis when x>−∞, and L∣]0,t[=−∞ for all t>0 small enough, which contradicts the hypothesis when x=−∞), and the case (i) of Lemma 1b) holds; in particular, x<+∞ and there exists ε>0 such that L∣[λ,λ+ε[ is real-valued and L∣]λ,λ+ε] is continuous.
Let (xα,yα) be a net in [−∞,+∞]2 such that limαxα=x and liminfαyα>x. Put y=liminfαyα.
∙ First assertion, the case x∈R: For each t∈ ]λ,λ+ε[,
Chebyshev’s inequality yields eventually
[TABLE]
hence
[TABLE]
and letting t→λ,
[TABLE]
where the last equality follows from Lemma 7 when λ=0.
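The upper bound in this step rests on the exponential form of Chebyshev's inequality. The sketch below illustrates that generic bound only, not the displayed inequalities of the proof: assuming μn is the law of the mean of n fair Bernoulli variables (an illustrative choice), it checks that log μn([x,+∞[) ≤ n(L(t)−tx) for several t>0. The names are hypothetical.

```python
import math

def L(t):
    """Log-moment generating function for the mean of fair Bernoulli variables."""
    return math.log((1.0 + math.exp(t)) / 2.0)

def log_upper_tail(n, x):
    """log P(S_n / n >= x) for S_n ~ Binomial(n, 1/2), computed by log-sum-exp."""
    k0 = max(0, math.ceil(n * x))
    logs = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        - n * math.log(2.0)
        for k in range(k0, n + 1)
    ]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

n, x = 200, 0.7
log_p = log_upper_tail(n, x)

# Exponential Chebyshev bound: P(S_n/n >= x) <= exp(-n t x) * E[exp(t S_n)]
bounds = {t: n * (L(t) - t * x) for t in (0.25, 0.5, 1.0, 2.0)}
for t, b in bounds.items():
    print(t, log_p, b)
```

Every value of t>0 gives a valid bound; optimizing over t is what produces the Legendre-Fenchel transform in the limit.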
Suppose that
[TABLE]
The hypothesis together with the continuity of L∗ on its effective domain implies the existence of t>λ and δ>0 such that
[TABLE]
and
[TABLE]
Since eventually
[TABLE]
we have eventually
[TABLE]
hence
[TABLE]
which contradicts (8) since l0(Lr′(t))=L∗(Lr′(t)) by Lemma 6; therefore, we have
[TABLE]
which together with (7) proves the first three equalities of the first assertion; the last equality is obvious when λ>0 (definition of L∗), and follows from Lemma 7 when λ=0.
∙ First assertion, the case x=−∞: Lemma 2 implies λ=0. Let (ti) be a sequence of positive numbers converging to zero, so that
eventually
[TABLE]
by Lemma 6.
Since limiLr′(ti)=−∞, there exists δ>0 such that xα+δ<Lr′(ti)<yα−δ eventually with respect to i and eventually with respect to α, which together with the above equality yields
[TABLE]
where the third equality is given by Lemma 7, and the last equality follows from the hypothesis of
right continuity of
L at zero. The first two equalities of the first assertion follow from the above expression together with Lemma 7 (recall that by convention, 0⋅(−∞)=0).
For
each i∈N, ti is a subgradient of L∗ at Lr′(ti) so that L∗ is non-decreasing on ]−∞,Lr′(ti)];
since limαxα=−∞, eventually xα belongs to the effective domain of L∗ and fulfils
[TABLE]
hence
[TABLE]
letting i→+∞ gives limαL∗(xα)=0, which proves the last two equalities of the first assertion.
The proof of the first assertion is complete.
∙ Second assertion: Let (ti) be a sequence in ]λ~,+∞[ converging to λ~ such that
L(ti) exists for all i∈N. Let (μβ,cβ,xβ,yβ) be a subnet of (μα,cα,xα,yα).
For each t∈R we put
[TABLE]
Since L(μβ,cβ)(ti)=L(ti) for all i∈N,
we have
[TABLE]
hence
[TABLE]
where the last equality follows from Lemma 1b). The inequality L(μβ,cβ)r′(λ+)≤L(μβ,cβ)r′(λ~+) together with (10) implies
[TABLE]
We have Lr′(ti)>Lr′(λ~+) for all i∈N (by definition of λ~) and
[TABLE]
so that
(Lr′(ti)) has a strictly decreasing subsequence (Lr′(tij)) converging to Lr′(λ~+), which implies eventually tij+1<tij<tij−1, hence
[TABLE]
and
[TABLE]
Put λ~(μβ,cβ)=sup{t>λ:L∣]λ,t](μβ,cβ) is affine}.
The inequality L(μβ,cβ)≤L together with (9) implies
[TABLE]
Claim 1**.**
When L(μβ,cβ)r′(λ+)>−∞, the hypothesis of Theorem 1 holds with the net (μβ,cβ) in place of (μα,cα). Furthermore, we have
L(μβ,cβ)(λ+)=L(λ+) and L(μβ,cβ)r′(λ+)=x.
Proof of Claim 1.
Assume L(μβ,cβ)r′(λ+)>−∞. Since x>−∞ by (11) we have
[TABLE]
where the second equality follows from (9) and the last equality follows from (10); therefore,
all the above inequalities are equalities, which gives
[TABLE]
which together with (9) implies
[TABLE]
Since L(μβ,cβ)≤L with L convex, (17) implies
[TABLE]
We have
[TABLE]
hence by (12) and (13), all the above inequalities are equalities, which together with (10) yields
[TABLE]
where the strict inequality follows from (14).
The above expression together with (10) and (18) gives
[TABLE]
which proves the first assertion of the claim.
The first equality of the second assertion is given by (16); the second equality of the second assertion is given by (10) when λ=λ~. Assume λ<λ~. Then, (17) and (18) yield
[TABLE]
Since L is differentiable at λ~ when λ~>λ, we have
[TABLE]
(where (16) is used to obtain the first inequality),
hence by (20) all the above inequalities are equalities, and the
second equality of the second assertion follows.
Claim 2**.**
When L(μβ,cβ)r′(λ+)=−∞, the hypothesis of Theorem 1 holds with the net (μβ,cβ) in place of (μα,cα); in particular, L(μβ,cβ)(λ+)=L(λ+) and L(μβ,cβ)r′(λ+)=x.
Proof of Claim 2.
The hypothesis implies λ=0 and λ~(μβ,cβ)=0, hence λ~=0 by (15); therefore, x=−∞ by (10), which implies L(λ+)=0 by hypothesis of Theorem 1; this last equality together with
(9) yields
L(μβ,cβ)(λ+)=0, which proves the claim.
By Claim 1 and Claim 2 we can apply the first assertion of Theorem 1 with the net (μβ,cβ,xβ,yβ) in place of (μα,cα,xα,yα), which gives
[TABLE]
Since the right hand side of the above last equality does not depend on (μβ,cβ,xβ,yβ), and
(μβ,cβ,xβ,yβ) is an arbitrary subnet of (μα,cα,xα,yα), the second assertion follows.
Proof of Corollary 1.
Let t∈[λ,λ+ε[ and put t~=sup{s>t:L∣]t,s] is affine}
(note that t~ is a maximum since L∣]λ,λ+ε[ is continuous).
Since (by hypothesis) sup{L′(t):t∈ ]λ,λ+ε[} is not a maximum, we have t~<λ+ε.
For each δ>0, the map L∣[t~,t~+δ[ is not affine
(otherwise, since L is differentiable at t~, the slope of L∣[t~,t~+δ[ would be the same as the one of L∣]t,t~], which would contradict the definition of t~). Therefore,
we have L′(s)>L′(t~+) for all s>t~; since L′(t+)=L′(t~+) (because L∣]t,t~] is affine and L is differentiable at t~) we get
L′(s)>L′(t+) for all s>t~. Consequently, by the last assertion of Lemma 1 b),
the case (i) of Lemma 1b) holds, hence the hypotheses of Theorem 1
hold for t (i.e. with λ=t in Theorem 1), and in particular when Lr′(t+)>−∞.
If Lr′(t+)=−∞, then t=λ=0 by Lemma 2;
since in this case L is assumed to be right continuous at zero, the hypotheses of Theorem 1 hold for t. Lemma 8a) ensures that for each
z∈[Lr′(λ+),Ll′(λ+ε)[ there exists tz∈[λ,λ+ε[ such that z=Lr′(tz+),
hence part a) follows applying Theorem 1 for all t∈[λ,λ+ε[.
Part b) follows from Lemma 8 b).
Appendix
Let λ≥0, let ε>0 and let f be a real-valued strictly convex and continuous function on [0,λ+ε] such that f(0)=0. In what follows, we show how from such a function f it is possible to build
a generalized log-moment generating function Lf with effective domain [0,λ+ε], such that Lf(t) is a limit for all t∈R, Lf is not differentiable at λ, Lf∣[0,λ] is affine, λ is a limit of a decreasing sequence (λi)i≥1 of non-differentiability points of Lf∣]λ,λ+ε[ and Lf∣[λi,λi−1] is affine for all i≥1 (with λ0=λ+ε). Therefore, the hypotheses of
Theorem 1 hold for λ, but the usual version of the Plachky-Steinebach theorem does not apply (either because L is not differentiable in a neighbourhood of λ, or because L is not strictly convex in a neighbourhood of λ); however, its conclusion remains true
(i.e. both conclusions of Theorem 1 hold); the case λ=0 and Lfr′(0+)=−∞ is included.
Note that λ is the only point in [0,λ+ε] to which Theorem 1 applies, since (a) every t∈[0,λ[ ∪ ]λ,λ+ε[ fulfils the case (ii) of Lemma 1b), hence the excluded case (ii) mentioned in Section 1, and (b) Lr′(λ+ε)=+∞, i.e. λ+ε fulfils the case (iv) of Lemma 1, hence the excluded case (i) mentioned in Section 1.
First, we extend f to a convex lower semi-continuous function on R by putting f(t)=+∞ for all t∉[0,λ+ε]. Put λ0=λ+ε.
Let (λi)i≥1 be a decreasing sequence in ]λ,λ+ε[ converging to λ.
Draw a line segment Di between (λi,f(λi)) and (λi+1,f(λi+1)) for all i∈N, and draw a line segment D between (0,0) and (λ,f(λ)). Let Lf be the function whose graph coincides with the graph of f (resp. Di, D) on ]−∞,0[ ∪ ]λ0,+∞[
(resp. [λi+1,λi] for all i∈N, [0,λ]). Clearly, Lf is a convex function on R, continuous on its effective domain [0,λ+ε], affine on [0,λ] and [λi+1,λi] for all i∈N,
and fulfils Lf(0)=0.
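The construction above can be sketched numerically on a finite truncation. The following assumes the hypothetical choices f(t)=t², λ=1, ε=1 and λi=λ+ε/(i+1) (so that λ0=λ+ε), and computes the slopes of the first 40 chords Di together with the slope of D; the function names are illustrative.

```python
def f(t):
    """Hypothetical strictly convex function with f(0) = 0."""
    return t * t

def build_nodes(lam, eps, count):
    """Nodes lambda_0 = lam + eps > lambda_1 > ... decreasing towards lam."""
    return [lam + eps / (i + 1) for i in range(count + 1)]

def chord_slope(a, b):
    """Slope of the line segment joining (a, f(a)) and (b, f(b))."""
    return (f(b) - f(a)) / (b - a)

lam, eps, count = 1.0, 1.0, 40
nodes = build_nodes(lam, eps, count)

# slopes[i] is the slope of D_i, i.e. of L_f on [lambda_{i+1}, lambda_i]
slopes = [chord_slope(nodes[i + 1], nodes[i]) for i in range(count)]

# slope of D, i.e. of L_f on [0, lam] (this requires lam > 0)
slope_D = chord_slope(0.0, lam)

print(slope_D, slopes[-1], slopes[0])
```

In this example the slopes of the chords Di decrease strictly as i grows, stay above f′(λ)=2, and remain above the slope of D, which reflects the kinks of Lf at every λi and at λ.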
The strict convexity of f ensures
for each i≥1 the existence of some ti∈ ]λi+1,λi[ such that
[TABLE]
hence
[TABLE]
and Lfl′(λ)<Lfr′(λ) when Lfr′(λ)>−∞.
Therefore, Lf is not differentiable at λ, and not differentiable at λi for all i∈N, but (21) and (22) show that
Lf fulfils the hypotheses of Theorem 1.
It remains to show that Lf is a generalized log-moment generating function such that Lf(t) is a limit for all t∈R. Let {zk:k≥1} be a countable set dense in the effective domain dom Lf∗ of Lf∗, and put
[TABLE]
Since Lf is lower semi-continuous, we have Lf=Lf∗∗ ([11], Theorem 23.5) i.e.
[TABLE]
hence
[TABLE]
and letting n→+∞,
[TABLE]
Since the set {zk:k≥1} is dense in dom Lf∗ and
Lf∗ is continuous on dom Lf∗, we get for each t∈R and for each z∈dom Lf∗,
[TABLE]
hence
[TABLE]
which together with (23) gives
[TABLE]
Since
[TABLE]
for all (t,n)∈R×N∖{0},
it follows from (24) that the generalized log-moment generating function associated with (μn,f,n−1) is a limit for all t∈R, and coincides with Lf.
Consequently, both conclusions of Theorem 1 hold, i.e. for every sequence (xn) converging to Lfr′(λ), and for every sequence (yn) fulfilling liminfnyn>Lfr′(λ), we have
[TABLE]