On the convex Poincaré inequality and weak transportation inequalities
Radosław Adamczak, Michał Strzelecki

TL;DR
This paper establishes an equivalence between the convex Poincaré inequality and weak transportation inequalities with quadratic-linear cost for probability measures on R^n, extending previous results and introducing new concentration inequalities.
Contribution
It generalizes the equivalence between convex Poincaré and weak transportation inequalities to higher dimensions and introduces modified logarithmic Sobolev inequalities for convex functions.
Findings
Proves the equivalence for R^n
Derives refined concentration inequalities for convex functions
Extends previous one-dimensional results
Abstract
We prove that for a probability measure on $\mathbb{R}^n$, the Poincaré inequality for convex functions is equivalent to the weak transportation inequality with a quadratic-linear cost. This generalizes recent results by Gozlan et al. and Feldheim et al., concerning probability measures on the real line. The proof relies on modified logarithmic Sobolev inequalities of Bobkov-Ledoux type for convex and concave functions, which are of independent interest. We also present refined concentration inequalities for general (not necessarily Lipschitz) convex functions, complementing recent results by Bobkov, Nayar and Tetali.
On the convex Poincaré inequality and weak transportation inequalities
Radosław Adamczak
Institute of Mathematics, University of Warsaw, Banacha 2, 02–097 Warsaw, Poland.
and
Michał Strzelecki
Institute of Mathematics, University of Warsaw, Banacha 2, 02–097 Warsaw, Poland.
(Date: Last changes : March 27, 2017.)
Key words and phrases:
Concentration of measure, convex functions, Poincaré inequality, weak transport-entropy inequalities.
2010 Mathematics Subject Classification:
Primary: 60E15. Secondary: 26B25, 26D10.
Research partially supported by the National Science Centre, Poland, grants no. 2015/18/E/ST1/00214 (R.A.) and 2015/19/N/ST1/00891 (M.St.).
1. Introduction
In the last thirty years a substantial body of research has been devoted to the interplay between various functional inequalities, the theory of transportation of measure, and the concentration of measure phenomenon, revealing intimate connections between them. While most of the investigations have been carried out in the setting of general Lipschitz functions, concentration inequalities restricted to the class of convex Lipschitz functions have also been considered by many authors, starting from the seminal work by Talagrand in the 1990s ([30, 31], see also [21, 24, 28, 29] and the monograph [22] for subsequent developments). A crucial feature of these results is that they hold under much less restrictive assumptions on the regularity of the underlying probability measure than inequalities valid for all Lipschitz functions. Even though the theory of concentration of measure for convex functions to some extent parallels the classical theory, there are some subtle differences: convexity is not preserved under general contractions, or even under a change of sign, which creates certain difficulties in the proofs and invalidates many well-known arguments established in the classical context. As a consequence, the theory of concentration of measure for convex functions has not yet reached a satisfactory level of completeness. Nevertheless, several important results have been obtained in recent years, connecting dimension-free concentration inequalities for convex functions with the convex Poincaré inequality [19] and a new type of weak transportation cost inequalities [16, 17]. We will now briefly describe these developments, which will allow us to formulate our main result.
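Let $|\cdot|$ stand for the standard Euclidean norm on $\mathbb{R}^n$. Let $\mu$ be a Borel probability measure on $\mathbb{R}^n$ and let $X$ be a random vector with law $\mu$. We say that $\mu$ (equivalently $X$) satisfies the convex Poincaré inequality with constant $\lambda > 0$ if for all convex functions $f\colon \mathbb{R}^n \to \mathbb{R}$ we have
$$\operatorname{Var} f(X) \le \frac{1}{\lambda}\, \mathbb{E}\, |\nabla f(X)|^2, \tag{1.1}$$
where by $|\nabla f(x)|$ we mean the length of gradient at $x$, defined as
$$|\nabla f(x)| = \limsup_{y \to x} \frac{|f(y) - f(x)|}{|y - x|}. \tag{1.2}$$
Note that this coincides with the length of the ‘true’ gradient provided $f$ is differentiable at $x$. Also, it is enough to assume that (1.1) holds for convex Lipschitz functions, since an arbitrary convex function can be pointwise approximated by convex Lipschitz functions.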
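As a quick numerical sanity check of an inequality of this type, the sketch below takes the standard Gaussian measure on the real line, which is classically known to satisfy the Poincaré inequality $\operatorname{Var} f \le \mathbb{E}|f'|^2$ for all smooth functions, and compares both sides for three convex test functions. The quadrature routine, the function names, and the choice of test functions are illustrative assumptions and do not come from the paper.

```python
import math

def gaussian_expectation(g, lo=-12.0, hi=12.0, steps=24000):
    """Approximate E g(Z) for Z ~ N(0, 1) with the composite trapezoid rule."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * g(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

def poincare_sides(f, df):
    """Return (Var f(Z), E f'(Z)^2) for a convex f with a.e. derivative df."""
    mean = gaussian_expectation(f)
    var = gaussian_expectation(lambda x: (f(x) - mean) ** 2)
    energy = gaussian_expectation(lambda x: df(x) ** 2)
    return var, energy

# Hypothetical convex test functions together with their derivatives.
examples = {
    "x^2": (lambda x: x * x, lambda x: 2.0 * x),
    "|x|": (abs, lambda x: math.copysign(1.0, x)),
    "exp(x/2)": (lambda x: math.exp(0.5 * x), lambda x: 0.5 * math.exp(0.5 * x)),
}
results = {name: poincare_sides(f, df) for name, (f, df) in examples.items()}
```

In each entry of `results` the first number (the variance) should not exceed the second (the gradient energy); for $f(x) = x^2$ the two sides are $2$ and $4$.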
It follows from the results by Gozlan, Roberto, and Samson [19] that satisfies the convex Poincaré inequality if and only if there exists a constant such that for any , any convex set with , and any ,
[TABLE]
where denotes the unit Euclidean ball in and stands for the Minkowski addition.
It is not difficult to see that (1.3) is equivalent to the one-sided deviation inequality for convex -Lipschitz functions, i.e.
[TABLE]
for all , where are i.i.d. copies of , and denotes the median of the random variable , i.e. .
Thus the convex Poincaré inequality is equivalent to a dimension free deviation inequality for the upper tail of convex Lipschitz functions.
Let us now pass to the connections between the Poincaré inequality and transportation inequalities. Let be a measurable function with . Recall that the optimal transport cost between two probability measures and on , induced by is given by
[TABLE]
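where the infimum is taken over all couplings $\pi$ between $\mu$ and $\nu$, i.e. over all probability measures $\pi$ on $\mathbb{R}^n \times \mathbb{R}^n$ such that $\pi(\cdot \times \mathbb{R}^n) = \mu$, $\pi(\mathbb{R}^n \times \cdot) = \nu$. Recall also that the relative entropy is defined as
$$H(\nu|\mu) = \int_{\mathbb{R}^n} \log\Bigl(\frac{d\nu}{d\mu}\Bigr)\, d\nu$$
if $\nu$ is absolutely continuous with respect to $\mu$ and $H(\nu|\mu) = +\infty$ otherwise.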
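For finitely supported measures the relative entropy reduces to a finite sum, $H(\nu|\mu) = \sum_x \nu(x) \log(\nu(x)/\mu(x))$, with the value $+\infty$ as soon as $\nu$ charges a point that $\mu$ does not. A minimal sketch under that assumption (the dictionary representation and the function name are hypothetical):

```python
import math

def relative_entropy(nu, mu):
    """H(nu | mu) for finite distributions given as dicts point -> mass."""
    total = 0.0
    for point, q in nu.items():
        if q == 0.0:
            continue  # convention 0 * log 0 = 0
        p = mu.get(point, 0.0)
        if p == 0.0:
            return math.inf  # nu is not absolutely continuous w.r.t. mu
        total += q * math.log(q / p)
    return total

mu = {0: 0.5, 1: 0.5}
nu = {0: 0.25, 1: 0.75}
h_value = relative_entropy(nu, mu)                      # finite, positive
h_singular = relative_entropy({0: 0.5, 2: 0.5}, mu)     # infinite
```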
It has been proved in [9] that satisfies the Poincaré inequality (1.1) for all smooth functions if and only if there exist constants such that for all probability measures ,
[TABLE]
where
[TABLE]
Recently Gozlan, Roberto, Samson, Shu, and Tetali [17] formulated a similar characterization of the convex Poincaré inequality on the real line. In order to formulate their result we need to introduce the weak transport cost between probability measures and corresponding transportation inequalities as defined in [16, 17].
In what follows, by $\mathcal{P}_1(\mathbb{R}^n)$ we denote the class of all probability measures $\mu$ on $\mathbb{R}^n$ such that $\int_{\mathbb{R}^n} |x|\, \mu(dx) < \infty$.
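Definition 1.1.
Let $\mu$ and $\nu$ be probability measures on $\mathbb{R}^n$. Assume that $\nu \in \mathcal{P}_1(\mathbb{R}^n)$. For a convex, lower semicontinuous function $\theta\colon \mathbb{R}^n \to [0, +\infty]$, such that $\theta(0) = 0$, define the weak transport cost between $\mu$ and $\nu$ as
$$\overline{\mathcal{T}}_\theta(\nu|\mu) = \inf_{\pi} \int_{\mathbb{R}^n} \theta\Bigl(x - \int_{\mathbb{R}^n} y\, p_x(dy)\Bigr)\, \mu(dx),$$
where the infimum is taken over all couplings $\pi$ between $\mu$ and $\nu$ and for $x \in \mathbb{R}^n$, $p_x$ is the conditional measure defined ($\mu$ almost surely) by $\pi(dx\, dy) = p_x(dy)\, \mu(dx)$.
Note that in the probabilistic notation one can write
$$\overline{\mathcal{T}}_\theta(\nu|\mu) = \inf\, \mathbb{E}\, \theta\bigl(X - \mathbb{E}(Y|X)\bigr),$$
where the infimum is taken over all pairs $(X, Y)$ of random vectors with values in $\mathbb{R}^n$, such that $X$ is distributed according to $\mu$ and $Y$ according to $\nu$.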
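The following sketch evaluates the quantity under the infimum for one fixed coupling of two finitely supported measures on the real line, assuming the weak cost has the probabilistic form $\mathbb{E}\,\theta(X - \mathbb{E}(Y|X))$, and compares it with the classical cost $\mathbb{E}\,\theta(X - Y)$ of the same coupling. By Jensen's inequality the first never exceeds the second when $\theta$ is convex. The coupling matrix, the cost $\theta(u) = u^2$, and all names are hypothetical; the code does not minimise over couplings, so each value is only an upper bound for the corresponding infimum.

```python
def coupling_costs(xs, ys, pi, theta):
    """For a coupling pi[i][j] = P(X = xs[i], Y = ys[j]) return
    (E theta(X - E(Y|X)), E theta(X - Y))."""
    weak, classical = 0.0, 0.0
    for i, x in enumerate(xs):
        mass = sum(pi[i])                      # P(X = x)
        if mass == 0.0:
            continue
        # conditional mean E(Y | X = x)
        cond_mean = sum(p * y for p, y in zip(pi[i], ys)) / mass
        weak += mass * theta(x - cond_mean)
        classical += sum(p * theta(x - y) for p, y in zip(pi[i], ys))
    return weak, classical

theta = lambda u: u * u          # illustrative convex cost
xs = [0.0, 1.0]
ys = [-1.0, 1.0, 2.0]
pi = [[0.25, 0.25, 0.0],
      [0.0, 0.25, 0.25]]
weak_cost, classical_cost = coupling_costs(xs, ys, pi, theta)
```

For this coupling the weak cost is $0.125$ while the classical cost is $0.75$: mass that $X = 0$ sends symmetrically to $\pm 1$ costs nothing in the weak sense because only the conditional barycentre matters.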
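Due to the asymmetry between $\mu$ and $\nu$, one can now introduce three different inequalities related to the cost $\overline{\mathcal{T}}_\theta$.
Definition 1.2.
Let $\mu \in \mathcal{P}_1(\mathbb{R}^n)$ and let $\theta\colon \mathbb{R}^n \to [0, +\infty]$ be a convex lower semicontinuous function with $\theta(0) = 0$. We will say that $\mu$ satisfies the inequality
- $\overline{\mathcal{T}}^{+}_\theta$ if for every probability measure $\nu \in \mathcal{P}_1(\mathbb{R}^n)$,
$$\overline{\mathcal{T}}_\theta(\nu|\mu) \le H(\nu|\mu),$$
- $\overline{\mathcal{T}}^{-}_\theta$ if for every probability measure $\nu \in \mathcal{P}_1(\mathbb{R}^n)$,
$$\overline{\mathcal{T}}_\theta(\mu|\nu) \le H(\nu|\mu),$$
- $\overline{\mathcal{T}}_\theta$ if $\mu$ satisfies both $\overline{\mathcal{T}}^{+}_\theta$ and $\overline{\mathcal{T}}^{-}_\theta$.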
The definition of those inequalities in [16] differs formally from the one presented above (which is taken from [17]). It is not difficult to see that the definitions presented in both articles are equivalent up to universal constants—the version above is more convenient for our purposes.
The authors of [17] proved that a probability measure on the real line satisfies the convex Poincaré inequality for some constant $\lambda > 0$ if and only if it satisfies the transportation inequality $\overline{\mathcal{T}}_{\theta_{C,D}}$ for some $C, D > 0$. In a dual formulation (expressed in terms of infimum convolution inequalities), this result has also been obtained in [14].
Our main result is an extension of this equivalence to arbitrary dimension.
Theorem 1.3.
Let $\mu$ be a probability measure on $\mathbb{R}^n$. Then the following conditions are equivalent:
- (i) There exists $\lambda > 0$ such that $\mu$ satisfies the convex Poincaré inequality (1.1).
- (ii) There exist $C, D > 0$ such that $\mu$ satisfies the transportation inequality $\overline{\mathcal{T}}_{\theta_{C,D}}$.
Remark 1.4.
The implication (ii) (i) is standard, in this case . In our proof the constants in the implication (i) (ii) depend not only on but also on certain quantiles related to the measure (which are always finite but may be of the order of up to ). This is related to the inequality responsible for the lower tail of convex functions, which is usually more difficult to deal with than the upper tail. We suspect that this is an artefact of our proof and one should be able to obtain with depending only on . As for our argument does yield it with depending only on (see Corollary 4.3 below for details).
Remark 1.5.
Thanks to well known tensorization properties of the inequality , Theorem 1.3 implies that the convex Poincaré inequality is equivalent to improved two-level dimension free concentration inequality for convex functions (see Example 6.9 below for a precise formulation). In the class of Lipschitz functions inequalities of this type have been first obtained by Talagrand [30] in the case of the product exponential distribution (with an alternate proof, using infimum-convolution inequalities, by Maurey [24]). The fact that they are consequences of the Poincaré inequality for smooth functions was established by Bobkov and Ledoux [6]. By results due to Gozlan et al. [19] this can be regarded as a self-improvement of dimension-free concentration properties of Lipschitz functions. Our result shows that similar self-improvements are present also in the setting of convex concentration.
Remark 1.6.
In [10] Bobkov and Götze provide a simple characterization of measures on which satisfy the convex Poincaré inequality for some (and thus also the inequality ) in terms of the probability distribution function. A similar characterization for larger seems to be a non-trivial open problem.
The organization of the article is as follows. First, in Section 2, we present preliminary properties of measures satisfying the convex Poincaré inequality and weak transportation inequalities, to be used in the proofs. Section 3 contains our most important technical result, i.e. modified log-Sobolev inequalities for convex and concave functions, which in Section 4 are combined with the Hamilton-Jacobi equations giving the proof of Theorem 1.3.
Next, in Section 5 we briefly discuss operations preserving the convex Poincaré inequality, which may be used to provide new non-trivial examples of measures satisfying it.
In Section 6 we present refined concentration of measure inequalities, which are consequences of weak transportation inequalities. We consider there more general cost functions than the one corresponding to the convex Poincaré inequality and discuss applications both to the Lipschitz and non-Lipschitz setting.
Finally, in Section 7 we state a few open questions. The Appendix contains basic facts concerning Hamilton-Jacobi equations, which are used in the proof of Theorem 1.3.
2. Preliminaries on the convex Poincaré inequality and weak transportation inequalities
In this section we present basic concentration of measure properties implied by the convex Poincaré inequality and the dual formulations of weak transportation inequalities. They will be needed in the proof of our main result.
We begin with a simple reformulation of the convex Poincaré inequality.
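Lemma 2.1.
Let $X$ be a random vector in $\mathbb{R}^n$ satisfying the convex Poincaré inequality (1.1). Then for every convex function $f\colon \mathbb{R}^n \to \mathbb{R}$,
$$\mathbb{E}\bigl(f(X) - \operatorname{Med} f(X)\bigr)^2 \le \frac{2}{\lambda}\, \mathbb{E}\, |\nabla f(X)|^2.$$
Proof.
Note that for every random variable $Y$, thanks to the fact that the median minimizes the mean absolute deviation, we have
$$|\mathbb{E} Y - \operatorname{Med} Y| \le \mathbb{E}|Y - \operatorname{Med} Y| \le \mathbb{E}|Y - \mathbb{E} Y| \le \sqrt{\operatorname{Var} Y}.$$
Thus
$$\mathbb{E}(Y - \operatorname{Med} Y)^2 = \operatorname{Var} Y + (\mathbb{E} Y - \operatorname{Med} Y)^2 \le 2 \operatorname{Var} Y,$$
and it is enough to set $Y = f(X)$ and apply (1.1). ∎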
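The elementary fact behind this proof, namely that the second moment about a median is at most twice the variance, can be checked exactly on a small discrete distribution. The sketch below assumes the identity $\mathbb{E}(Y - m)^2 = \operatorname{Var} Y + (\mathbb{E} Y - m)^2$ for any constant $m$; the distribution and the helper names are hypothetical.

```python
def mean(values, probs):
    return sum(v * p for v, p in zip(values, probs))

def median(values, probs):
    """A median of a finite distribution: smallest atom m with P(Y <= m) >= 1/2."""
    acc = 0.0
    for v, p in sorted(zip(values, probs)):
        acc += p
        if acc >= 0.5:
            return v
    return max(values)

def variance(values, probs):
    mu = mean(values, probs)
    return mean([(v - mu) ** 2 for v in values], probs)

def second_moment_about_median(values, probs):
    m = median(values, probs)
    return mean([(v - m) ** 2 for v in values], probs)

# A skewed three-point distribution (illustrative choice).
values = [0.0, 1.0, 10.0]
probs = [0.5, 0.25, 0.25]
lhs = second_moment_about_median(values, probs)   # E (Y - Med Y)^2
rhs = 2.0 * variance(values, probs)               # 2 Var Y
gap = (mean(values, probs) - median(values, probs)) ** 2
```

Here `lhs` equals $25.25$ and `rhs` equals $35.375$, and `lhs` coincides with the variance plus `gap`.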
2.1. Concentration inequalities
Let us start with the already mentioned (see (1.4)) upper tail estimate for convex Lipschitz functions implied by the convex Poincaré inequality. The proposition below can also be obtained from abstract results of [19], but we would like to provide an alternative derivation based on moments (the possibility of such a proof was suggested in [19]). Our strategy mimics a well-known approach from the general Lipschitz case (see e.g. Proposition 2.5 in [25]); however, we have to deal with some small difficulties related to the fact that in the convex setting we cannot truncate the function, as this operation does not preserve convexity.
Proposition 2.2.
Assume that is a random vector in , satisfying the convex Poincaré inequality (1.1). Then for any -Lipschitz convex function and any ,
[TABLE]
Proof of Proposition 2.2.
Consider the random variable , where is arbitrary such that , and let be an independent copy of . Since the function is convex,
[TABLE]
and so , which implies that is exponentially integrable. In particular for every Lipschitz function and all , .
Assume now that is convex. Then for all , applying Lemma 2.1 to the convex function (note that its median is zero and ), we obtain
[TABLE]
where we used Hölder’s inequality with exponents , . If we additionally assume that is Lipschitz, so that , we get
[TABLE]
which via Chebyshev’s inequality in implies
[TABLE]
for . Now, if the Lipschitz constant of equals one, the above inequality yields for ,
[TABLE]
Remark 2.3.
Another possible approach is based on the Laplace transform: assume without loss of generality that and denote for . Since the function is convex, the Poincaré inequality yields
[TABLE]
The idea would be now to regroup the expressions appearing in the above inequality, repeat the procedure (with instead of ), and—after a simple limit argument—obtain a bound on . After that we could use Markov’s inequality and optimize in to obtain an estimate of the upper tail of . However a delicate issue emerges: we have to a priori know that (for reasonable choices of the parameter ) is integrable (in the setting of smooth functions one overcomes this problem simply by truncating , for convex functions one would need e.g. to repeat the beginning of the proof of Proposition 2.2); cf. the remark following Theorem 6.8 in [19].
We do not know if the convex Poincaré inequality implies similar tail estimates—which depend only on and the Lipschitz constant of the function—for the lower tail of convex Lipschitz functions, i.e. for , (cf. Question 7.3 below).
Nonetheless, we can easily get estimates in terms of and certain quantiles of . They will be crucial in the proof of the implication
[TABLE]
Lemma 2.4.
Let be a random vector in satisfying the convex Poincaré inequality (1.1) and let be any number such that . Then for every convex and for any ,
[TABLE]
Proof.
By Proposition 2.2 (note that the function is convex and -Lipschitz),
[TABLE]
Let be a convex function. Without loss of generality we may assume . We have ,
[TABLE]
Thus there exists such that , , and . Define
[TABLE]
where is any subgradient of at , so that for all . Taking with we see that , and thus we have
[TABLE]
If now , we can conclude from (2.3) that
[TABLE]
which ends the proof. ∎
2.2. Infimum convolution. Dual formulation of transportation inequalities
We will rely on the following lemma proved in [17] (and in a slightly different version also in [16]). The proof in [17] is presented for the real line, but it is not difficult to see that it generalizes to arbitrary dimension.
Lemma 2.5.
Let be a convex cost function, , . For all functions bounded from below, , and set
[TABLE]
Then
- (i)
* satisfies if and only if for all convex , bounded from below,*
[TABLE]
- (ii)
* satisfies if and only if for all convex , bounded from below,*
[TABLE]
- (iii)
if satisfies , then for all convex , bounded from below,
[TABLE]
holds with . Conversely, if satisfies (2.6) for some , then it satisfies with .
Moreover, the inequality (2.4) (resp. (2.5)) for all convex, Lipschitz functions bounded from below is a sufficient condition for (resp. ).
The inequality (2.6) was introduced by Maurey in [24] and the relation with transportation cost inequalities was first observed in [7].
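To make the notion of infimum convolution concrete, here is a minimal sketch for the classical one-dimensional operator $\varphi \mapsto \inf_y \{\varphi(y) + (x - y)^2/(2t)\}$. This is an assumption made for illustration only: the operator appearing in Lemma 2.5 is defined in a display that is not reproduced here and involves a general convex cost, so the quadratic cost, the grid, and the function names below are hypothetical choices. For $\varphi(y) = |y|$ the infimum has a closed form of quadratic-linear (Huber) type, which the grid minimisation reproduces.

```python
def inf_convolution(phi, t, x, grid):
    """Grid approximation of inf_y { phi(y) + (x - y)^2 / (2 t) }."""
    return min(phi(y) + (x - y) ** 2 / (2.0 * t) for y in grid)

def huber(t, x):
    """Closed form of the infimum above for phi(y) = |y|."""
    return x * x / (2.0 * t) if abs(x) <= t else abs(x) - t / 2.0

grid = [k / 1000.0 for k in range(-5000, 5001)]   # y in [-5, 5], step 0.001
t = 0.5
points = [-2.0, -0.3, 0.0, 0.2, 1.5]
numeric = [inf_convolution(abs, t, x, grid) for x in points]
exact = [huber(t, x) for x in points]
```

The output is quadratic near the origin and linear far from it, which is the shape of cost that the equivalence in Theorem 1.3 is formulated with.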
3. From convex Poincaré to modified log-Sobolev inequalities for convex and concave functions
In this section we present modified log-Sobolev inequalities for convex and concave functions which are implied by the convex Poincaré inequality. Our approach builds heavily on the arguments introduced by Bobkov and Ledoux in [6] for arbitrary Lipschitz functions, however some non-trivial modifications will be necessary in order to handle the difficulties imposed by the restriction of the Poincaré inequality to convex functions.
In what follows for a nonnegative random variable , we define its entropy as
[TABLE]
if and otherwise. We refer to e.g. [5, 22] for basic properties of entropy and log-Sobolev inequalities.
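For a nonnegative random variable with finitely many values, the usual entropy functional $\operatorname{Ent} Y = \mathbb{E}\, Y \log Y - \mathbb{E} Y \log \mathbb{E} Y$ can be computed directly. The sketch below assumes this standard formula (with the convention $0 \log 0 = 0$) and illustrates three basic properties: nonnegativity, homogeneity $\operatorname{Ent}(cY) = c \operatorname{Ent} Y$ for $c > 0$, and vanishing on constants. The sample values and the function name are hypothetical.

```python
import math

def entropy(values, probs):
    """Ent Y = E[Y log Y] - E[Y] log E[Y] for a finite nonnegative Y (0 log 0 = 0)."""
    ey = sum(v * p for v, p in zip(values, probs))
    if ey == 0.0:
        return 0.0
    eylogy = sum(p * v * math.log(v) for v, p in zip(values, probs) if v > 0.0)
    return eylogy - ey * math.log(ey)

values = [0.5, 1.0, 4.0]
probs = [0.25, 0.5, 0.25]
ent = entropy(values, probs)
ent_scaled = entropy([3.0 * v for v in values], probs)   # should equal 3 * ent
ent_constant = entropy([2.0, 2.0], [0.5, 0.5])           # should vanish
```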
Throughout this section we assume that is a probability measure on satisfying the convex Poincaré inequality (1.1) and that is a random vector with law , which will not be explicitly stated in all the theorems.
3.1. Modified log-Sobolev inequalities for convex functions
Theorem 3.1.
Let be convex with for all . Then
[TABLE]
where
[TABLE]
Our constants are slightly worse than in [6], basically because we need to work with the median rather than the mean. However the argument (which works also in the classical case) seems to slightly simplify the technicalities of [6]. The proof relies on two propositions.
Proposition 3.2.
Let be convex with and for all . Then
[TABLE]
where $C_1 = C_1(c, \lambda) = \bigl(\sqrt{\lambda/2} - c/2\bigr)^{-2}$.
Proof.
For we define and
[TABLE]
One easily checks that , , and is convex nondecreasing.
Denote and (where ). The function is convex, moreover . Hence, by Lemma 2.1,
[TABLE]
Note that (by Proposition 2.2 and since ). Thus and the assertion follows. ∎
Proposition 3.3.
Let be either convex or concave, with and for all . Then
[TABLE]
where . Consequently,
[TABLE]
Proof.
If vanishes with probability one, there is nothing to prove. Otherwise, denote by the expectation with respect to the probability measure with density relative to . By Jensen’s inequality,
[TABLE]
Thus, using the trivial inequality , we conclude that
[TABLE]
But since
[TABLE]
we can bound by . This yields the assertion of the proposition. ∎
Proof of Theorem 3.1.
Without loss of generality assume . Denote , . By the formula and the convexity of ,
[TABLE]
(note that for this argument to work we do not need the expectation of to vanish). Thus Propositions 3.2 and 3.3 imply the assertion of the theorem. ∎
3.2. Modified log-Sobolev inequalities for concave functions
Theorem 3.4.
Let be convex with for all . Assume that satisfies . Then
[TABLE]
where is a constant depending only on .
Remark 3.5.
If we denote by the coordinates of , then by the Poincaré inequality we have
[TABLE]
and hence, by the Chebyshev inequality, satisfies . Thus in fixed dimension and for say , the constant in Theorem 3.4 can be bounded uniformly over all probability measures satisfying the convex Poincaré inequality with constant .
Proof of Theorem 3.4.
We start as in the proof of Theorem 3.1. Denote (this is a concave function). Without loss of generality assume . Denote , . By the convexity of ,
[TABLE]
We have
[TABLE]
By Proposition 3.3, , so it remains to estimate .
Integration by parts and Lemma 2.4 yield
[TABLE]
if only . Similarly (using Lemma 2.4 in its full strength),
[TABLE]
for some . Thus, by Proposition 3.3,
[TABLE]
This, together with (3.2) and (3.3), ends the proof:
[TABLE]
4. Proof of the main result
We will now present the proof of Theorem 1.3. As already mentioned, the implication (ii) $\Rightarrow$ (i) is standard; we provide a sketch of its proof just for the sake of completeness. The proof of the implication (i) $\Rightarrow$ (ii) follows the arguments first introduced in [9], based on the analysis of the Hamilton-Jacobi equations. A crucial element of the proof will be the modified log-Sobolev inequalities obtained in Section 3.
Lemma 4.1.
Let be a random vector in . Assume that there exist and such that
[TABLE]
and the inequality
[TABLE]
holds for every convex (respectively: concave) -Lipschitz function . Then, for every convex Lipschitz function bounded from below,
[TABLE]
where , , is the infimum convolution operator with the cost function
[TABLE]
Remark 4.2.
The condition (4.1) is introduced to exclude heavy-tailed measures for which the only exponentially integrable convex functions are constants. Note that in this case the inequality (4.2) is trivially satisfied, while the transportation inequality cannot hold (as it implies the existence of exponential moments).
If we recall the dual formulations of the weak transport-entropy inequalities and (see Lemma 2.5), the definition of from (1.8), and the results of the preceding section (namely, Theorems 3.1 and 3.4), we immediately obtain the following corollaries.
Corollary 4.3.
Let be a random vector in satisfying the convex Poincaré inequality (1.1). Then, for any , the law of satisfies the inequality with
[TABLE]
Corollary 4.4.
Let be a random vector in satisfying the convex Poincaré inequality (1.1) and let be any number such that . Then, for any , the law of satisfies the inequality for some constant depending only on , , and .
Proof of Lemma 4.1.
Suppose that the log-Sobolev inequality (4.2) holds for all convex and -Lipschitz functions. We first present a perturbation argument which allows us to work with random vectors with an absolutely continuous law. We then shall follow the approach of [17, Proof of Theorem 1.5].
Let be a Gaussian random vector in , independent of , with the covariance matrix being a sufficiently small multiple of identity, so that it satisfies the usual log-Sobolev inequality with constant ,
[TABLE]
for all Lipschitz functions (see e.g. Theorem 5.1. in [22] for an equivalent formulation).
Then, by the tensorization property of entropy (see e.g. Proposition 5.6. in [22]), the random vector on satisfies the modified log-Sobolev inequality
[TABLE]
for all convex functions which are -Lipschitz with respect to the first coordinate (here and denote partial lengths of gradients with respect to the first and second variable, with the other variable fixed).
Let be a convex -Lipschitz function and consider . Applying the inequality (4.4) to the function defined by the formula for (which is -Lipschitz with respect to the first variable), we see that the random vector satisfies the modified log-Sobolev inequality
[TABLE]
where . Note that the law of is absolutely continuous with respect to the Lebesgue measure on , and so almost surely is a differentiability point of and coincides with the Euclidean length of the ‘true’ gradient .
Moreover, (4.5) can be rewritten in the form
[TABLE]
where
[TABLE]
is the Legendre transform of .
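As a reminder of how a Legendre transform behaves, the sketch below approximates $\theta^*(p) = \sup_x \{p x - \theta(x)\}$ on a grid for the one-dimensional quadratic $\theta(x) = x^2/2$, whose transform is again $p^2/2$. The function transformed in the proof is given by a display not reproduced here, so the quadratic is a stand-in chosen for illustration, and the grid and names are hypothetical.

```python
def legendre_transform(theta, p, grid):
    """Grid approximation of theta*(p) = sup_x { p * x - theta(x) }."""
    return max(p * x - theta(x) for x in grid)

grid = [k / 1000.0 for k in range(-10000, 10001)]   # x in [-10, 10], step 0.001
quadratic = lambda x: 0.5 * x * x                    # illustrative choice of theta
slopes = [-2.0, -0.5, 0.0, 1.0, 3.0]
numeric = [legendre_transform(quadratic, p, grid) for p in slopes]
exact = [0.5 * p * p for p in slopes]
```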
If is convex, Lipschitz (with arbitrary Lipschitz constant) and bounded from below, then is well defined, convex (as an infimum convolution of two convex functions), and -Lipschitz for (since and the function is -Lipschitz for ).
Moreover, the function is Lipschitz on and satisfies the Hamilton-Jacobi equation
[TABLE]
(see Proposition A.1 in Appendix A). Set
[TABLE]
(Note that since is -Lipschitz.) Using the integrability properties of (and as a consequence of ), together with the Lipschitz property of it is not difficult to see that is locally Lipschitz and for Lebesgue almost all ,
[TABLE]
where we used (4.6), the definition of , and the fact that is -Lipschitz. Thus
[TABLE]
or, in other words,
[TABLE]
It is easy to see that by taking we arrive at the assertion of the lemma (recall that and are Lipschitz and ).
Suppose now that the log-Sobolev inequality (4.2) holds for all concave and -Lipschitz functions. As before, we pass to the random vector which has an absolutely continuous distribution. Let be convex and bounded from below. Then the function is concave and -Lipschitz. The same calculation as above yields
[TABLE]
or equivalently
[TABLE]
We stress that now, in order to prove the Hamilton-Jacobi equations via Proposition A.1, we need to use the -Lipschitz property of , since in general is not bounded from below.
Since
[TABLE]
(to verify the inequality take ), a limit argument yields the assertion. ∎
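The role of the Hamilton-Jacobi equation in arguments of this kind can be illustrated on an explicit example. Assuming the quadratic cost $x^2/2$ (a stand-in for the cost in Lemma 4.1, which is not reproduced here), the infimum convolution of $\varphi(y) = |y|$ is the function $u(t, x) = \inf_y \{|y| + (x - y)^2/(2t)\}$, which has a closed form and solves $\partial_t u + \tfrac12 |\partial_x u|^2 = 0$ away from the kink $|x| = t$. The sketch checks this with central finite differences; the sample points and helper names are hypothetical.

```python
def u(t, x):
    """inf_y { |y| + (x - y)^2 / (2 t) } in closed form (a Huber-type function)."""
    return x * x / (2.0 * t) if abs(x) <= t else abs(x) - t / 2.0

def hj_residual(t, x, h=1e-5):
    """Central-difference approximation of d/dt u + (d/dx u)^2 / 2."""
    du_dt = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    du_dx = (u(t, x + h) - u(t, x - h)) / (2.0 * h)
    return du_dt + 0.5 * du_dx ** 2

# Sample points chosen away from the set |x| = t, where u changes regime.
samples = [(0.5, 0.2), (0.5, -1.3), (1.0, 0.7), (2.0, 3.5)]
residuals = [hj_residual(t, x) for t, x in samples]
```

All residuals should be zero up to discretisation and rounding error.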
We are now ready for the proof of our main result.
Proof of Theorem 1.3.
The implication (i)(ii) follows immediately from Corollaries 4.3 and 4.4, and the definition of . To obtain the reverse implication one can use a standard Taylor expansion argument. Assume that holds. Let be convex, Lipschitz, and bounded from below. For denote
[TABLE]
where is any subgradient of at , so that on . Taking with we see that .
For sufficiently small we have for all , and hence
[TABLE]
(recall that ). We now substitute into the dual formulation (2.6) and use the above estimate. An inspection of the Taylor expansions up to order yields
[TABLE]
This ends the proof. ∎
5. Examples of measures satisfying the convex Poincaré inequality
We will now discuss several tools which allow one to construct measures satisfying the convex Poincaré inequality. To shorten the notation we will denote by $\mathbb{E}_\mu f$ and $\operatorname{Var}_\mu f$ respectively the mean and variance of $f$ seen as a random variable on $\mathbb{R}^n$ equipped with probability measure $\mu$.
Let us start with the well known tensorization property of variance (see e.g. [5, Proposition 1.4.1]), which asserts that whenever are probability measures on , , then the product measure on , satisfies the inequality
[TABLE]
for every function , where denotes the variance of treated as a function on , with the other coordinates fixed.
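The tensorization property of variance can be verified exactly on a small product of finitely supported measures. The sketch below assumes the standard form of the inequality, $\operatorname{Var}_\mu f \le \sum_i \mathbb{E}_\mu \operatorname{Var}_{\mu_i} f$, and enumerates all atoms of the product; the marginals, the test function, and all names are hypothetical. For a function that is a sum of functions of single coordinates the two sides coincide.

```python
import itertools

def product_variance_check(marginals, f):
    """marginals: list of dicts value -> prob. Returns (Var f, sum_i E Var_i f)."""
    points = list(itertools.product(*[list(m.items()) for m in marginals]))

    def expect(g):
        # expectation of g under the product measure, by full enumeration
        total = 0.0
        for combo in points:
            prob = 1.0
            for _, p in combo:
                prob *= p
            total += prob * g([v for v, _ in combo])
        return total

    mean = expect(f)
    var = expect(lambda x: (f(x) - mean) ** 2)

    def cond_var(i):
        # x -> variance of f in coordinate i with the other coordinates of x fixed
        def inner(x):
            vals = [(f(x[:i] + [v] + x[i + 1:]), p) for v, p in marginals[i].items()]
            m = sum(val * p for val, p in vals)
            return sum(p * (val - m) ** 2 for val, p in vals)
        return inner

    rhs = sum(expect(cond_var(i)) for i in range(len(marginals)))
    return var, rhs

marginals = [{0.0: 0.5, 1.0: 0.5}, {-1.0: 0.3, 2.0: 0.7}, {0.0: 0.2, 1.0: 0.3, 3.0: 0.5}]
f = lambda x: max(x) + abs(x[0] - x[1])        # a convex function on R^3
var_f, tensor_bound = product_variance_check(marginals, f)
additive = product_variance_check(marginals, lambda x: sum(x))
```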
This immediately implies the tensorization property for the convex Poincaré inequality, namely if () is a probability measure on , satisfying the convex Poincaré inequality with constant , then the product measure on satisfies
[TABLE]
for every convex function , where denotes the ‘partial length of gradient’ along . If the measures are absolutely continuous with respect to the Lebesgue measure, then by Rademacher’s theorem locally Lipschitz functions are almost everywhere differentiable, in particular the right-hand side of the above inequality coincides with and so we obtain that satisfies the convex Poincaré inequality with constant . The situation is more delicate for measures which are not absolutely continuous, however thanks to results by Gozlan, Roberto and Samson [19], we can obtain the following simple proposition.
Proposition 5.1.
Assume that are probability measures on , , satisfying the convex Poincaré inequality with constant . Then the measure on satisfies the convex Poincaré inequality with constant for some universal constant
Proof.
We provide only a sketch of the proof, leaving some computational details to the Reader. Denote and consider an arbitrary convex smooth 1-Lipschitz function on , . By (5.1) we have . Using an analogous argument as in the proof of Proposition 2.2 (for , to remain in the smooth setting) we arrive at
[TABLE]
for all 1-Lipschitz smooth convex functions. We can extend this inequality to arbitrary 1-Lipschitz convex function (approximating them with 1-Lipschitz smooth convex functions, e.g. by convolving them with Gaussian densities, see [28, p. 429]), so in particular we get that for any convex set , with , and all ,
[TABLE]
where is the unit Euclidean ball in . Recall the notation
[TABLE]
By [19, Theorem 6.7], the dimension free subexponential concentration for convex sets of the form (5.2) implies that satisfies the Poincaré inequality
[TABLE]
for all convex functions , where
[TABLE]
where is the Gaussian tail function. Using the estimate and performing some elementary calculations, we arrive at the assertion of the proposition. ∎
Remark 5.2.
The above argument shows that if satisfies the Poincaré inequality (1.1) then it also satisfies the formally stronger inequality (5.3) with . We remark that in the category of all Lipschitz functions it is known that the Poincaré inequalities with the length of gradients and are equivalent and the involved constants do not change (cf. [19, Remark 1.1]).
Tensorization allows in particular to pass from one-dimensional measures satisfying the convex Poincaré inequality (characterized in [10]) to product measures in higher dimensions. Another standard tool for producing new examples is perturbation: if satisfies the convex Poincaré inequality with constant and is a measure with density with respect to , then satisfies the convex Poincaré inequality with constant . For the proof see e.g. [5, Chapter 3.4] (the proof therein is written in the context of Markov processes and Dirichlet forms but it is based only on the elementary observation that and works in exactly the same way in the convex setting).
Perturbation and tensorization are tools that first appeared in the ‘classical’ theory of Poincaré and log-Sobolev inequalities for smooth (or locally Lipschitz) functions. The next proposition does not have a counterpart in the classical setting and significantly extends the set of tools for creating new examples. Namely, we will show that the convex Poincaré inequality passes to mixtures of measures. Note that this cannot be the case for the classical Poincaré inequality, since it clearly cannot hold for measures with disconnected support. We note however that the preservation of the Poincaré and log-Sobolev inequalities by mixtures of measures with overlapping supports has been investigated by Chafaï and Malrieu in [11]. In particular, Proposition 5.3 below was inspired by the calculations in Section 4.1 therein.
Let stand for the usual Kantorovich transport cost between and (defined by taking in (1.5)), in other words the square of the Kantorovich-Wasserstein distance .
Proposition 5.3.
Let , be probability measures on which satisfy the convex Poincaré inequality (1.1) with constants and respectively. Then the measure , , satisfies the convex Poincaré inequality (1.1) with constant
[TABLE]
Proof.
If is a convex function, then
[TABLE]
and it suffices to estimate the last term.
Let and be random vectors in with laws and respectively. By convexity of ,
[TABLE]
Thus,
[TABLE]
Taking the infimum over all realizations of and implies the assertion. ∎
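A proof of this kind typically starts from the decomposition of the variance under a mixture into a within-component part and a between-component part; the sketch below checks that decomposition numerically. It assumes the mixture is written as $t\mu_0 + (1 - t)\mu_1$ and that the decomposition reads $\operatorname{Var} f = t \operatorname{Var}_{\mu_0} f + (1 - t) \operatorname{Var}_{\mu_1} f + t(1 - t)(\mathbb{E}_{\mu_0} f - \mathbb{E}_{\mu_1} f)^2$, which is the law of total variance. The symbols, the two discrete measures with disjoint supports, and the helper names are hypothetical and need not match the paper's notation.

```python
def mean_var(values, probs):
    m = sum(v * p for v, p in zip(values, probs))
    return m, sum(p * (v - m) ** 2 for v, p in zip(values, probs))

def mixture_variance_terms(f, atoms0, probs0, atoms1, probs1, t):
    """Return (variance under t*mu0 + (1-t)*mu1, within part, between part)."""
    f0 = [f(x) for x in atoms0]
    f1 = [f(x) for x in atoms1]
    m0, v0 = mean_var(f0, probs0)
    m1, v1 = mean_var(f1, probs1)
    mixed_probs = [t * p for p in probs0] + [(1 - t) * p for p in probs1]
    _, v_mix = mean_var(f0 + f1, mixed_probs)
    within = t * v0 + (1 - t) * v1
    between = t * (1 - t) * (m0 - m1) ** 2   # the term involving both components
    return v_mix, within, between

f = lambda x: abs(x - 1.0)                   # convex test function
v_mix, within, between = mixture_variance_terms(
    f, [0.0, 2.0], [0.5, 0.5], [5.0, 6.0, 9.0], [0.2, 0.3, 0.5], t=0.3)
```

Only the between-component term couples the two measures, which is why a bound on it in terms of a transport cost between the components is enough to control the variance of the mixture.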
6. Refined concentration of measure derived from infimum convolution inequalities
In this section we explain which concentration inequalities for convex functions can be obtained from general infimum convolution inequalities of the form (2.6). While some parts of our derivation are well known and are included only for the sake of completeness, we also provide new inequalities valid beyond the setting of Lipschitz functions. Their proofs are elementary, but to the best of our knowledge they have not been noted in the literature before.
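Throughout this section $\theta\colon \mathbb{R}^n \to [0, \infty)$ is a convex function. We also assume the following conditions:
- $\theta(x) = \theta(-x)$ for all $x \in \mathbb{R}^n$,
- $\theta(x) = 0$ if and only if $x = 0$ (in particular, by convexity, $\theta(x) \to \infty$ as $|x| \to \infty$).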
We remark that at the cost of some technical work one can obtain the results we present below for more general cost functions (e.g. taking the value or not satisfying the symmetry condition). We restrict to the smaller class to simplify the presentation.
In what follows, for a function , bounded from below, we set
[TABLE]
We also denote
[TABLE]
6.1. Enlargements of sets and concentration for Lipschitz functions
Let us start with the classical description of concentration of measure in terms of enlargements of sets. The following proposition goes back to [24].
Proposition 6.1.
Assume that is a probability measure on , satisfying
[TABLE]
for all convex functions , bounded from below. Then for all convex subsets and , we have
[TABLE]
Proof.
Consider and note that if and only if there exists such that . Applying the inequality (6.1) to (which can be justified by monotone approximation), we obtain
[TABLE]
To formulate corollaries to the above proposition we need to introduce new notation, which at first may seem rather abstract. However, as the examples presented in the subsequent parts of this section will show, it will prove useful in providing a uniform framework for concentration inequalities, especially in the non-Lipschitz case.
Definition 6.2.
Define the norm on , as the Orlicz norm corresponding to the function , i.e.
[TABLE]
Define also the norm on as the dual to , i.e.
[TABLE]
The norm is equivalent (up to universal constants) to the Orlicz norm related to the function , explicitly given by
[TABLE]
It was observed by Gluskin and Kwapień in [15] that norms of this type play an important role in moment estimates for sums of independent random variables. Recently it has been noted [3, 1] that they also appear in moment estimates for smooth functions of random vectors satisfying modified log-Sobolev inequalities. Since in the context of transportation or infimum convolution inequalities one starts from the function and not from (which is the case in the corresponding log-Sobolev setting) it is more convenient to work with rather than with the equivalent norm used in [3, 1].
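The following sketch shows how a gauge of Orlicz (Luxemburg) type can be evaluated numerically. It assumes the generic form inf{ a > 0 : theta(x / a) <= level }; the exact normalization used in Definition 6.2 is not visible in this text, so the `level` parameter and the function name are hypothetical. The bisection relies on the fact that a -> theta(x / a) is non-increasing for a convex theta vanishing at the origin.

```python
def orlicz_gauge(theta, x, level=1.0, iterations=200):
    """Approximate inf{ a > 0 : theta(x / a) <= level } by bisection.

    Assumes theta is convex, finite, and vanishes at the origin, so that
    a -> theta(x / a) is non-increasing and tends to 0 as a grows.
    """
    if all(c == 0 for c in x):
        return 0.0

    def scaled(a):
        return theta([c / a for c in x])

    hi = 1.0
    while scaled(hi) > level:  # grow until the constraint is satisfied
        hi *= 2.0
    lo = 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if scaled(mid) <= level:
            hi = mid
        else:
            lo = mid
    return hi


def sum_of_squares(x):
    # Illustrative choice of theta; it yields a multiple of the Euclidean norm.
    return sum(c * c for c in x)


GAUGE_LEVEL_1 = orlicz_gauge(sum_of_squares, [3.0, 4.0], level=1.0)
GAUGE_LEVEL_4 = orlicz_gauge(sum_of_squares, [3.0, 4.0], level=4.0)
```

With theta equal to the sum of squares, the gauge at level p is the Euclidean norm divided by the square root of p, which gives a quick check of the routine.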
In what follows we will need the following simple inequality which follows from convexity of and the assumption . For , , and ,
[TABLE]
The following corollary to Proposition 6.1 is again based on by now standard arguments, written however in the language of the norms .
Corollary 6.3.
Let be a random vector with law , satisfying (6.1) for all convex functions bounded from below. Then for any smooth convex Lipschitz function and ,
[TABLE]
Remark 6.4.
It is easy to see that if the inequality (6.3) holds for all smooth convex Lipschitz functions, then one can apply it to an arbitrary convex Lipschitz function, replacing by the Lipschitz constant of with respect to the norm . To verify this it is enough to consider convolutions of with a sequence of Gaussian densities converging to Dirac’s mass at zero; they are smooth, have the same Lipschitz constant as and converge to uniformly (see e.g. [28, p. 429]).
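The mechanism behind this approximation can be illustrated in one dimension. The sketch below replaces the Gaussian convolution by a finite weighted average of translates x -> f(x + eps * t), with weights proportional to the Gaussian density on a grid. This is a simplifying assumption: such a discrete average is not smooth, so the sketch only illustrates that averaging translates preserves convexity and the Lipschitz constant and stays uniformly within L * eps of an L-Lipschitz f. The grid parameters and names are illustrative.

```python
import math


def gaussian_weights(half_width=6.0, n=601):
    """Nodes on [-half_width, half_width] with normalized Gaussian weights."""
    step = 2.0 * half_width / (n - 1)
    nodes = [-half_width + i * step for i in range(n)]
    raw = [math.exp(-0.5 * t * t) for t in nodes]
    total = sum(raw)
    return nodes, [r / total for r in raw]


def smooth_by_gaussian_average(f, eps, nodes, weights):
    """Return x -> sum_i w_i * f(x + eps * t_i), a discretized Gaussian average of f."""
    def f_eps(x):
        return sum(w * f(x + eps * t) for t, w in zip(nodes, weights))
    return f_eps


NODES, WEIGHTS = gaussian_weights()
EPS = 0.1
SMOOTHED_ABS = smooth_by_gaussian_average(abs, EPS, NODES, WEIGHTS)
SAMPLE_POINTS = [-2.0 + 0.05 * i for i in range(81)]
```

For f(x) = |x| the uniform distance between the average and f is at most eps times the mean absolute value of the weights' distribution, which is below eps here.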
Proof of Corollary 6.3.
Let , so that . Then by convexity, for any ,
[TABLE]
Thus
[TABLE]
where in the second inequality we used Proposition 6.1.
Let now . Similarly as above, we obtain
[TABLE]
which shows that
[TABLE]
Combining the last inequality with (6.5) proves the corollary. ∎
6.2. Concentration inequalities for general convex functions
We are now ready to state the main result of this section, contained in the following theorem, dealing with general (not necessarily Lipschitz) convex functions. In its formulation we adopt the convention . The proof of the theorem as well as of its corollary is postponed to Section 6.3.
We would like to emphasize that in the theorem we assume only (6.3), which is strictly weaker than the infimum-convolution inequality (6.1).
Theorem 6.5.
Let be a random vector satisfying (6.3) for all smooth convex Lipschitz functions . Then for any smooth convex function , the following properties hold.
- (i)
For any ,
[TABLE]
- (ii)
Let , and let satisfy . Then
[TABLE]
In particular for ,
[TABLE]
- (iii)
For all ,
[TABLE]
Remark 6.6.
As will become clear in the proof, the part (i) of the above theorem holds in fact under one-sided concentration, i.e. it is enough to assume that
[TABLE]
Let us now illustrate the above theorem with a few concrete examples and a corollary. In particular we will show what the norms look like for different choices of the cost function .
Example 6.7.
If for some and , then and (6.3) is equivalent to
[TABLE]
for all 1-Lipschitz convex functions (in particular for we get the subgaussian concentration). The first part of Theorem 6.5 gives then the following inequality for all (not necessarily Lipschitz) convex functions and ,
[TABLE]
Thus by the -Chebyshev inequality, with we obtain for ,
[TABLE]
(the additional factor on the right-hand side is introduced artificially to encompass all , also those for which ; note that in this case the right-hand side exceeds one). We remark that similar self-normalized inequalities are known e.g. in the theory of empirical processes (see [12]).
The lower tail inequality gives
[TABLE]
Moreover, using the full strength of part (ii) of Theorem 6.5, one can replace by , where is the quantile of . Thus no integrability conditions on the gradient are in fact required.
Remark 6.8.
Let us note that inequalities similar to (6.11) were previously known with the quantity instead of the quantile or (see [28] or [23, Chapter 3.3]). Very recently, Paouris and Valettas [26] have proved that the standard Gaussian vector in satisfies a similar inequality (for ) with in place of . Their proof uses in a crucial way isoperimetric properties of Gaussian measures. The version with follows simply by an application of the (1,1)-Poincaré inequality for the Gaussian measure, i.e. (see e.g. [27, 25]). In fact the proof in [26] gives also inequalities in terms of quantiles of . We do not know if they are comparable to our estimates (specialized to the standard Gaussian measure) in terms of quantiles of .
Note also that (6.9) for is a consequence of the convex Poincaré inequality (however we do not know if (1.1) implies (6.9) with depending only on and not on the dimension , see Question 7.3 below).
Example 6.9.
Let us now consider a measure on satisfying the convex Poincaré inequality with constant . Then, by Theorem 3.1 it satisfies the convex Bobkov-Ledoux inequality (3.1) with constants and depending only on . By the classical Herbst argument it follows (see e.g. [6, 2]) that for each , if is an -dimensional random vector with law , then for any smooth convex function and any ,
[TABLE]
where for , denotes the partial gradient with respect to .
Moreover, by the Poincaré inequality
[TABLE]
which, at the cost of changing the constant, allows one to replace the mean by the median in the above inequality. Thus we obtain that for some constant and ,
[TABLE]
It is easy to see that up to universal constants is equivalent to , where
[TABLE]
More precisely
[TABLE]
Thus, the first part of Theorem 6.5 together with Remark 6.6 gives, for an arbitrary smooth convex function on , the inequality
[TABLE]
for , where depends only on . By Chebyshev’s inequality this implies that
[TABLE]
for (note that contrary to (6.10) this time cannot be removed from the denominator).
As for the lower tail, by Theorem 1.3, Remark 1.4, Lemma 2.5 and tensorization properties of infimum convolution inequalities (see Lemma 5 in [24]) we obtain that satisfies (6.1) and thus also (6.3) with , where depends only on and the dimension . Thus, by the second part of Theorem 6.5,
[TABLE]
or equivalently (up to constants depending only on ),
[TABLE]
We stress that all the above inequalities are dimension-free in the sense that the constants do not depend on the number but just on the initial dimension (cf. Remark 1.5).
Example 6.10.
Finally, we remark that general cost functions lead to other concentration profiles, which have been studied in the literature. One can for instance consider products of measures on , satisfying (6.1) with
[TABLE]
for (such measures are characterized thanks to results in [17]). If we denote for , and let be the Hölder conjugate of , then such costs correspond for to norms of the form (the case has been discussed above), while for to
[TABLE]
where is the non-increasing rearrangement of the sequence .
We will now present a corollary to Theorem 6.5, providing concentration inequalities for non-Lipschitz convex functions, in the spirit of recent results due to Bobkov, Nayar, and Tetali [8].
Corollary 6.11.
Under the assumptions of Theorem 6.5 for all convex functions ,
[TABLE]
Moreover, for any ,
[TABLE]
Let us note that inequalities of the form (6.12) have been obtained in [1] for all smooth functions of random vectors satisfying modified log-Sobolev inequalities (assumed to hold for all smooth functions). Therein, the function had to satisfy some appropriate growth condition.
Example 6.12.
In particular for , the above corollary gives
[TABLE]
By substituting and adjusting the constant we obtain
[TABLE]
where is positive and depends only on . The factor 2 in the above inequality is introduced for notational simplicity, to allow the whole range of in the infimum (note that for large we have and we cannot apply Corollary 6.11; on the other hand, the above inequality then becomes trivial, as the right-hand side exceeds one).
Recall also the second part of Theorem 6.5 which for gives in this case
[TABLE]
where and again depends only on .
The above inequalities should be compared with a recent result in [8], which asserts that for some constant positive depending only on ,
[TABLE]
where is an independent copy of .
It is not difficult to see that in the regime of for which the above inequalities are of interest, i.e. the right-hand sides are small, (6.13) gives estimates on the upper tail which (up to numerical constants) are comparable to those implied by (6.15), whereas for the lower tail, the inequality (6.14) improves over (6.15).
Example 6.13.
Consider now , which we have already discussed in Example 6.9. From Corollary 6.11 we get
[TABLE]
By substituting and using the union bound we obtain
[TABLE]
with depending only on . As in the preceding example, the factor 2 is introduced to allow for all positive values of .
Remark 6.14.
Let us note that another way of obtaining estimates on the upper tail of non-Lipschitz functions under the convex Poincaré inequality is to use the estimates (2.1) and (2.2). By approximating arbitrary convex functions with Lipschitz ones we can easily see that they hold in fact for all convex functions. Thus, if one controls the moments of , one can obtain tail estimates beyond the Lipschitz case. Such inequalities are, however, different from those of the above example, as they are of exponential type and not of mixed exponential or Gaussian type. On the other hand, the weak transportation inequality with cost arises usually as a consequence of tensorization, so in order to apply it we need some additional product structure of the measure.
6.3. Proofs of Theorem 6.5 and Corollary 6.11
Proof of Theorem 6.5.
Let us start with (i), the proof of which is quite similar to the proof of Corollary 6.3. Let us again define . Using (6.2) and (6.4), we can write for ,
[TABLE]
Hence for ,
[TABLE]
where we used the fact that the function is convex, 1-Lipschitz with respect to and , together with Corollary 6.3 and Remark 6.4. We can now integrate by parts and get
[TABLE]
(the integrand is pointwise non-increasing with respect to , as the computation of the derivative with respect to reveals), which proves the first part of the theorem.
Let us now pass to the second part. Assume without loss of generality that . Consider the set . By the definition of , we have . Let be defined as
[TABLE]
Then is convex, moreover by convexity of we have pointwise and on . By the definition of the set and inequality (6.2), for any all linear functionals , , are -Lipschitz with respect to and therefore so is . By Corollary 6.3 and Remark 6.4 this implies that for any ,
[TABLE]
We also have . Therefore, the above inequality applied with gives
[TABLE]
which by another application of (6.16) implies
[TABLE]
This proves the first inequality of part (ii).
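The construction used in this step can be illustrated in one dimension. The sketch assumes, based on the properties listed in the proof, that the auxiliary function is the supremum of the tangent functionals y -> f(y) + f'(y)(x - y) taken over points y of the set where the gradient is small; the display defining it is not reproduced in this text. The choice f(x) = x^2 with the set {|f'| <= 2} = [-1, 1], sampled on a finite grid, is illustrative.

```python
def tangent_envelope(f, grad, base_points):
    """Return g(x) = max over y in base_points of f(y) + grad(y) * (x - y)."""
    def g(x):
        return max(f(y) + grad(y) * (x - y) for y in base_points)
    return g


def square(x):
    return x * x


def square_grad(x):
    return 2.0 * x


SLOPE_BOUND = 2.0
# Finite sample of the set {y : |f'(y)| <= SLOPE_BOUND} = [-1, 1].
BASE_POINTS = [i / 100.0 for i in range(-100, 101)]
ENVELOPE = tangent_envelope(square, square_grad, BASE_POINTS)
CHECK_POINTS = [-4.0 + 0.1 * i for i in range(81)]
```

By convexity every tangent lies below f, so the envelope is a convex minorant of f that agrees with f on the base points, and it is Lipschitz with constant at most the largest slope used. Outside [-1, 1] it grows linearly, for instance ENVELOPE(3.0) equals 1 + 2 * (3 - 1) = 5.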
The second inequality of part (ii) follows from the first one by specializing to , and some elementary calculations.
As for part (iii), using again (6.2) and (6.7), we get for
[TABLE]
Now, again by integration by parts,
[TABLE]
which ends the proof. ∎
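The integration by parts steps in the proof above rest on the tail formula E Z^p = integral over t >= 0 of p t^(p-1) P(Z >= t) dt for a nonnegative random variable Z. The sketch below checks this generic identity numerically for a standard exponential variable, for which P(Z >= t) = exp(-t) and E Z^p = Gamma(p + 1). The exponential example, the truncation point and the midpoint rule are illustrative assumptions and are not taken from the paper.

```python
import math


def moment_from_tail(tail, p, upper=60.0, n=200_000):
    """Midpoint-rule approximation of the integral of p * t**(p-1) * tail(t) over [0, upper]."""
    h = upper / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += p * t ** (p - 1) * tail(t)
    return total * h


def exponential_tail(t):
    # P(Z >= t) for a standard exponential random variable Z.
    return math.exp(-t)


THIRD_MOMENT = moment_from_tail(exponential_tail, 3.0)
```

In the proof the same formula is applied with the tail bound supplied by the concentration inequality in place of the exact tail, which turns a tail estimate into a moment estimate.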
Proof of Corollary 6.11.
To prove the first inequality it is enough to note that if and , then
[TABLE]
where the last inequality follows from (6.6). The assertion follows thus from Chebyshev’s inequality: .
As for the second inequality, we apply the first one with and combine it with the estimate (6.7). ∎
7. Further questions
Let us conclude with some open questions, which seem natural in view of our results.
As already mentioned in the introduction, in our proof of the implication
[TABLE]
the constants do not depend just on , but also on certain quantiles of the measure . In fact, the issue comes from the inequality , since the constants in do depend only on (see Corollary 4.3). This gives rise to our first question.
Question 7.1.
Does the Poincaré inequality with constant imply the weak transportation inequality with constants depending only on ?
The inspection of our proof shows that in order to answer the above question in the affirmative, it is enough to remove the restriction on in Lemma 2.4. An improved version of this lemma, valid for all would follow by part (ii) of Theorem 6.5 provided that one can show that the convex Poincaré inequality with constant implies subexponential concentration for convex 1-Lipschitz functions, with constants depending only on . The problem lies in the lower-tail (as the upper one is handled by Proposition 2.2). More precisely, we have the following result.
Theorem 7.2.
Assume that is a probability measure on , satisfying the convex Poincaré inequality (1.1) with constant and is a positive constant, such that for all -Lipschitz convex functions and all ,
[TABLE]
Then satisfies the inequality with depending only on and .
This motivates the following question, which is clearly of interest also in its own right.
Question 7.3.
Does the convex Poincaré inequality (1.1) with constant imply subexponential estimates for the lower-tail of convex 1-Lipschitz functions, with constants depending only on ? Specifically, is it true that whenever is a probability measure on satisfying (1.1), then for every convex -Lipschitz function ,
[TABLE]
where the constant depends only on ?
The inequality provided by Lemma 2.4 introduces an additional dependence on , which carries over to the dependence of constants in Theorem 1.3. Let us point out that all the proofs of lower-tail estimates based on the Poincaré inequality and available for the category of all smooth functions, which we have been able to find in the literature, seem to break down in the convex setting (see e.g. the arguments in [20, 4, 19]).
Appendix A Facts related to Hamilton-Jacobi equations
We will now present some basic properties of Hamilton-Jacobi equations related to infimum convolution operators with the cost , where is given by (4.3), which have been exploited in the proof of Lemma 4.1. We remark that all the facts we will rely on are quite standard; however, in the literature they are usually considered under slightly different sets of assumptions, which makes it difficult to find an off-the-shelf result applicable to our situation. We will briefly indicate how the reasonings from [13, Chapter 3] can be modified to yield the properties we need. Alternatively, as in [17], one could rely on a modification of the results from [18], where the theory of Hamilton-Jacobi equations is extended to the setting of metric spaces.
Proposition A.1.
Let be positive constants and let be defined by (4.3). Assume that is either bounded from below or -Lipschitz and let be given by , where
[TABLE]
Then the following conditions hold.
- (a)
For every and every , .
- (b)
The function is Lipschitz on ,
- (c)
At every point of differentiability of , one has
[TABLE]
where is the Legendre transform of , given explicitly by the formula
[TABLE]
Sketch of proof.
Let us note that if is bounded from below or -Lipschitz, then is well defined.
Ad (a). To show the semigroup property one can repeat the argument from the proof of [13, Chapter 3.3.2, Lemma 1], however in our setting one needs to work with infima rather than minima.
Ad (b). For fixed , is -Lipschitz as a function of , as an infimum of -Lipschitz functions. Indeed for each , the function is -Lipschitz. As for the Lipschitz property with respect to , the argument in the proof of [13, Chapter 3.3.2, Lemma 2] shows that if is -Lipschitz, then for any ,
[TABLE]
where . Now the Lipschitz condition with respect to (for general , which may not be -Lipschitz) follows from the semigroup property and the fact that is an -Lipschitz function of .
Ad (c). Using again the fact that is -Lipschitz, it is enough to consider the case when so is . One can then repeat the proof of [13, Chapter 3.3.2, Theorem 5], provided that one can prove that the infimum in the definition of is in fact achieved. To this end, it is enough to note that whenever we have, denoting ,
[TABLE]
where the inequality holds by the Lipschitz property of and the last equality follows from the definition of (and the fact that lies on the interval with endpoints and ). Thus and the existence of the minimizer follows from compactness and continuity of and . ∎
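The semigroup property in (a) can be checked numerically on a grid. The sketch below assumes an operator of Hopf-Lax form, Q_t f(x) = inf_y { f(y) + t * cost((x - y) / t) }, and uses the purely quadratic cost v^2/2 as a stand-in for the quadratic-linear function of (4.3), whose formula is not reproduced in this text. For f(y) = y^2/2 and this cost one has the closed form Q_t f(x) = x^2 / (2(1 + t)). The grid and the time parameters are illustrative.

```python
def hopf_lax(values, grid, t, cost):
    """Grid version of Q_t f(x) = inf_y { f(y) + t * cost((x - y) / t) }.

    `values` holds f on `grid`; the result holds Q_t f on the same grid.
    """
    return [
        min(v + t * cost((x - y) / t) for y, v in zip(grid, values))
        for x in grid
    ]


def quadratic_cost(v):
    # Stand-in cost (an assumption); the paper's cost is quadratic-linear.
    return 0.5 * v * v


STEP = 0.025
GRID = [STEP * i for i in range(-200, 201)]  # grid on [-5, 5]
F_VALUES = [0.5 * y * y for y in GRID]
T, S = 0.5, 0.3

# Q_S applied to Q_T f, versus Q_{S+T} f computed directly.
COMPOSED = hopf_lax(hopf_lax(F_VALUES, GRID, T, quadratic_cost), GRID, S, quadratic_cost)
DIRECT = hopf_lax(F_VALUES, GRID, S + T, quadratic_cost)
```

On a grid the two sides agree only up to discretization error, and the comparison is meaningful away from the boundary of the grid, where the minimizers stay inside it.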
References
- [1] Radosław Adamczak, Witold Bednorz, and Paweł Wolff, Moment estimates implied by modified log-Sobolev inequalities, to appear in ESAIM: Probability and Statistics.
- [2] Radosław Adamczak and Michał Strzelecki, Modified log-Sobolev inequalities for convex functions on the real line. Sufficient conditions, Studia Math. 230 (2015), no. 1, 59–93. MR 3456588
- [3] Radosław Adamczak and Paweł Wolff, Concentration inequalities for non-Lipschitz functions with bounded derivatives of higher order, Probab. Theory Related Fields 162 (2015), no. 3-4, 531–586. MR 3383337
- [4] S. Aida and D. Stroock, Moment estimates derived from Poincaré and logarithmic Sobolev inequalities, Math. Res. Lett. 1 (1994), no. 1, 75–86. MR 1258492
- [5] Cécile Ané, Sébastien Blachère, Djalil Chafaï, Pierre Fougères, Ivan Gentil, Florent Malrieu, Cyril Roberto, and Grégory Scheffer, Sur les inégalités de Sobolev logarithmiques, Panoramas et Synthèses [Panoramas and Syntheses], vol. 10, Société Mathématique de France, Paris, 2000, With a preface by Dominique Bakry and Michel Ledoux. MR 1845806
- [6] S. Bobkov and M. Ledoux, Poincaré’s inequalities and Talagrand’s concentration phenomenon for the exponential distribution, Probab. Theory Related Fields 107 (1997), no. 3, 383–400. MR 1440138
- [7] S. G. Bobkov and F. Götze, Exponential integrability and transportation cost related to logarithmic Sobolev inequalities, J. Funct. Anal. 163 (1999), no. 1, 1–28. MR 1682772
- [8] Sergey Bobkov, Piotr Nayar, and Prasad Tetali, Concentration Properties of Restricted Measures with Applications to Non-Lipschitz Functions, to appear in GAFA Seminar Notes (2015), arXiv:1506.06174.
