High-dimensional limit theorems for random vectors in $\ell_p^n$-balls. II
Zakhar Kabluchko, Joscha Prochno, Christoph Thaele

TL;DR
This paper establishes a central limit theorem, a moderate deviations principle, and a large deviations principle for the $\ell_q$-norms of high-dimensional random vectors in $\ell_p^n$-balls, extending previous work with new applications to projections.
Contribution
It introduces a unified framework for limit theorems for random vectors in $\ell_p^n$-balls under a general class of distributions, including new applications to projections.
Findings
Proved a central limit theorem for the $\ell_q$-norm of random vectors in $\ell_p^n$-balls.
Established a moderate deviations principle.
Derived a large deviations principle.
Abstract
In this article we prove three fundamental types of limit theorems for the $\ell_q$-norm of random vectors chosen at random in an $\ell_p^n$-ball in high dimensions. We obtain a central limit theorem, a moderate deviations as well as a large deviations principle when the underlying distribution of the random vectors belongs to a general class introduced by Barthe, Guédon, Mendelson, and Naor. It includes the normalized volume and the cone probability measure as well as projections of these measures as special cases. Two new applications to random and non-random projections of $\ell_p^n$-balls to lower-dimensional subspaces are discussed as well. The text is a continuation of [Kabluchko, Prochno, Thäle: High-dimensional limit theorems for random vectors in $\ell_p^n$-balls, Commun. Contemp. Math. (2019)].
High-dimensional limit theorems
for random vectors in $\ell_p^n$-balls. II
Zakhar Kabluchko, Institut für Mathematische Stochastik, Westfälische Wilhelms-Universität Münster, Germany
Joscha Prochno, Institut für Mathematik & Wissenschaftliches Rechnen, Karl-Franzens-Universität Graz, Austria
Christoph Thäle, Faculty of Mathematics, Ruhr University Bochum, Germany
Key words and phrases:
Asymptotic geometric analysis, central limit theorem, convex bodies, $\ell_p^n$-balls, large deviations principle, moderate deviations principle, stochastic geometry
2010 Mathematics Subject Classification:
Primary: 60F10, 52A23 Secondary: 60D05, 46B09
1. Introduction and main results
The study of high-dimensional geometric structures and particularly of convex bodies has received considerable attention in the last decade. In part, this was triggered by modern applications in high-dimensional statistics, machine learning, and numerical analysis. Many of the deep discoveries are of a probabilistic flavor or have been obtained by means of novel and powerful probabilistic methods. It therefore comes as no surprise that (central) limit theorems have been obtained for various quantities that appear in high-dimensional stochastic geometry or the asymptotic theory of convex bodies. Probably the first high-dimensional central limit theorem is known as the Poincaré-Maxwell-Borel Lemma (see, e.g., [9, 23]). It shows that the distribution of the first $k$ coordinates of a point chosen uniformly at random from the $n$-dimensional Euclidean ball or sphere converges to a $k$-dimensional Gaussian distribution, as the dimension $n$ of the ambient space tends to infinity. The most prominent result of the past years is arguably Klartag’s central limit theorem for isotropic convex bodies [15], showing that most low-dimensional marginals of random points chosen uniformly at random from a convex body are approximately Gaussian. Many more deep central limit phenomena have been discovered in the recent past. Among others, there is a central limit theorem for the volume of convex hulls of Gaussian random vectors obtained by Bárány and Vu in [5] or Reitzner’s central limit theorems for the volume and the number of faces of a given dimension of random polytopes in smooth convex bodies [20] that were obtained when the number of random points tends to infinity (see also Bárány and Thäle [4] and Thäle, Turchi, and Wespi [24] for results about general intrinsic volumes).
There is a central limit theorem due to Paouris, Pivovarov, and Zinn [18] for the volume of -dimensional random projections of the -dimensional cube when , a result that had previously been obtained by Kabluchko, Litvak, and Zaporozhets [12] in the special case . Alonso-Gutiérrez, Prochno, and Thäle [1] proved a central limit theorem and Berry-Esseen bounds for the Euclidean norm of random orthogonal projections of points chosen uniformly at random from the unit ball of , as , and Kabluchko, Prochno, and Thäle [13] obtained a multivariate central limit theorem for the -norm of random vectors chosen uniformly at random in the unit -ball of , which extended the corresponding -dimensional result obtained by Schmuckenschläger [22].
While the results in the previous paragraph describe central limit phenomena for several geometry related quantities, there is considerably less known about the large deviations behavior. Large deviations principles, which appear on the scale of a law of large numbers, have only recently been introduced in geometric functional analysis by Gantert, Kim, and Ramanan [11], who obtained a large deviations principle for -dimensional random projections of -balls in , as the space dimension tends to infinity. Subsequent work of Alonso-Gutiérrez, Prochno, and Thäle [1] provided a description of the large deviations behavior for the Euclidean norm of projections of -balls to high-dimensional random subspaces (the so-called annealed case), and Kabluchko, Prochno, and Thäle [13] obtained a complete description of the large deviations behavior of -norms of high-dimensional random vectors that are chosen uniformly at random in an -ball, which can be seen as an asymptotic version of a result of Schechtman and Zinn [21].
The motivation for this manuscript is essentially three-fold and we shall discuss the details in the following subsections together with our corresponding results. The first is the aim for an extension of the (multivariate) central limit theorems obtained in [13, Theorem 1.1] and [22, Proposition 2.4] and the large deviations principles [13, Theorems 1.2 and 1.3] to a considerably wider class of distributions on $\ell_p^n$-balls. The second aim is to bridge the gap between the Gaussian fluctuations described by the central limit theorem and the large deviations, and to describe the moderate deviations behavior of the random variables studied there. Moderate deviations are typically non-parametric (in contrast to large deviations) and consider probabilities on scales between those of a law of large numbers and a central limit theorem. These new findings for the moderate scaling therefore complement and refine both the new central limit theorems (Theorem A and Theorem B) as well as the new large deviations principle (Theorem D). For a variety of applications of such results, besides the ones presented below, we refer the reader to [13].
Before we present our results, let us explain the distributional set-up of this manuscript. As already mentioned, we consider a much more general class of distributions compared to [13] and [22]. These have been introduced and studied by Barthe, Guédon, Mendelson, and Naor [6], and are closely related to the geometry of $\ell_p^n$-balls. This class contains the uniform distribution considered in [1, 11, 13] and the cone probability measure on the $\ell_p^n$-unit ball as special cases, and many more (see below). As usual, denotes the -norm of the vector , and the parameter $p$ satisfies $0<p<\infty$. For every , we let be any Borel probability measure on , be the uniform distribution, and be the cone probability measure on . The distributions we consider are of the form
[TABLE]
where the function is given by with
[TABLE]
In other words this means that
[TABLE]
for all non-negative measurable functions , where denotes the -sphere. The class of measures of the form contains the following important cases, which are of particular interest (see Theorem 1, Theorem 3, Corollary 3, and Corollary 4 in [6]):
- (i)
If is the exponential distribution with rate (and mean ), then , , and reduces to the uniform distribution on .
- (ii)
If is the Dirac measure concentrated at $0$, then , , and is just the cone probability measure on .
- (iii)
If is a gamma distribution with shape parameter and rate , then is the beta-type probability measure on with Lebesgue density given by
[TABLE]
In particular, if for some , this is the image of the cone probability measure on under the orthogonal projection onto the first coordinates. Similarly, if , this distribution arises as the image of the uniform distribution on under the same orthogonal projection.
After having discussed the class of distributions we consider, we now turn to our main results.
Remark 1.
Note that although for $0<p<1$ the unit balls are not convex, we decided to include them in our analysis, simply because our results are valid in this regime as well. On the other hand, we leave out the case $p=\infty$, since in this case we can only treat the uniform distribution on the unit ball and this was already studied in [13].
1.1. Central limit theorems
The first result in this manuscript is a generalization of the central limit theorems [13, Theorem 1.1] and [22, Proposition 2.4] to the broader class of distributions presented above. While the result can in principle be proved in a multivariate form, we prefer to stay in the one-dimensional setting for clarity and for ease of comparison with the moderate and large deviations principles discussed in the next subsections. The theorem below describes the Gaussian fluctuations of the -norm of vectors chosen at random from the balls according to the measures . In this paper, we denote by and convergence in probability and in distribution, respectively. Moreover, we put
[TABLE]
for any .
Theorem A (Central limit theorem).
Fix and . Let be a sequence of Borel probability measures on . For each let be distributed according to and according to . Assume that
[TABLE]
Then
[TABLE]
where is a centered Gaussian random variable with variance
[TABLE]
Let us return to the situations (i)–(iii) described above and discuss some special cases of Theorem A. If for each , for some fixed Borel probability measure on , then assumption (3) is clearly satisfied. In particular, taking to be the Dirac measure at zero (recall (ii) above) or the exponential distribution with rate (recall (i) above), we recover the central limit theorem of Schmuckenschläger [22], see also Kabluchko, Prochno, and Thäle [13]. As another example, we fix a sequence of positive real numbers such that , as , and let for each , be the gamma distribution with shape parameter and some fixed rate . Markov’s inequality implies that (3) is satisfied in this case, from which the central limit theorem follows. In particular, taking we cover the situation discussed under (iii) above.
Remark 2.
In the case when , the asymptotic variance vanishes. In this case, Theorem A just states the distributional convergence of to $0$.
Remark 3.
Theorem A should be compared with the (multivariate) central limit theorem for proved in [19]. The latter is valid under the condition that , as . One can in fact show (with some effort, see the previous remark) that our condition (3) implies the one in [19]. However, we prefer to give an alternative and separate argument, since it can be developed further to give a proof of our MDP.
In one of our applications we present in Section 2 below, a slight generalization of Theorem A is needed, where we allow the random variables to converge to a non-trivial limiting distribution after a suitable centering and rescaling by .
Theorem B (Generalized central limit theorem).
Fix and . Let be a sequence of Borel probability measures on . For each let be distributed according to and according to . Assume that
[TABLE]
with for some , where is a sequence of non-negative real numbers satisfying , as . Then
[TABLE]
where is a centered Gaussian random variable with variance
[TABLE]
We emphasize that Theorem B is indeed a generalization of Theorem A. Namely, if (4) is satisfied with for all and then and hence , as , so that (3) is satisfied. Moreover, let us briefly mention that Theorem B allows us to consider, for example, a gamma distribution for with constant rate and shape parameter satisfying , as . We take advantage of this flexibility in Section 2 below.
1.2. Moderate deviations principle
We will next describe the moderate deviations. A moderate deviations principle (MDP) is formally nothing other than a large deviations principle (LDP), but the two principles differ in important ways. For instance, while LDPs provide estimates on the scale of a law of large numbers, MDPs describe the probabilities at scales between a law of large numbers and a distributional limit theorem (like a central limit theorem). Moreover, while the rate function in an LDP depends in a subtle way on the distribution of the underlying random variables, the rate function in an MDP in typical situations is non-parametric and given by the Gaussian one inherited from a central limit theorem. Let us recall that a sequence of random vectors in () satisfies an LDP with speed and ‘good rate function’ if
[TABLE]
for all measurable ( being the interior and the closure of ), where is lower semi-continuous and has compact level sets , . We say in this paper that a sequence satisfies an MDP if the speed sequence is given by with a positive sequence satisfying and , where for two sequences and we use the Landau notation if and if . In our case, the random variables are suitably scaled versions of the -norm of random points in .
The following MDP complements both the central limit theorems (Theorem A, Theorem B and also [13, Theorem 1.1]) as well as the large deviations result proved in [13, Theorem 1.2] and Theorem D below.
Theorem C (Moderate deviations principle).
Fix and with . Let be a sequence of Borel probability measures on and be a sequence of positive real numbers satisfying and . For each let be distributed according to . Assume that, for all ,
[TABLE]
Then the sequence of random variables
[TABLE]
satisfies an MDP with speed and good rate function , , where is the variance from Theorem A.
In particular, Theorem C implies that, for all ,
[TABLE]
Let us briefly return to the special cases (i)–(iii). Clearly, if is the Dirac measure at zero, Assumption (5) is satisfied. This covers case (ii) from above. On the other hand, if for each , is a gamma distribution with shape parameter and rate , we can use the MDP for sums of independent random variables (see Lemma 11 below) to conclude that Assumption (5) is satisfied if . Especially, taking , this covers cases (i) and (iii).
Remark 4.
If , then the core term for the MDP that we study in Lemma 16 below vanishes and therefore, we do not obtain an MDP with a non-trivial rate function.
1.3. Large deviations principle
The third type of limit theorem we obtain is a large deviations principle. As we shall see in a moment, contrary to the quadratic and non-parametric rate function in the MDP, the LDP is more sensitive to the underlying distribution and displays a significant difference in behavior depending on the parameter and its relative position with respect to the parameter .
Theorem D (Large deviations principle).
Fix and with . Let be a sequence of Borel probability measures on and for each let be distributed according to . For each let be distributed according to . Then the sequence of random variables satisfies the following LDPs:
- (1)
If $q<p$, we assume that the sequence satisfies an LDP with speed and good rate function . Then the LDP holds with speed and good rate function , where and is the Legendre-Fenchel transform of the function
[TABLE]
- (2)
If $q>p$, we assume that the sequence is exponentially equivalent to $0$ in the sense that
[TABLE]
for all . Then the LDP is with speed and good rate function
[TABLE]
We emphasize that while the rate function for is universal in the sense that it does not depend on (provided that does not vanish in a neighborhood of ), this is not the case for the rate function for , which in a subtle way depends on . As examples we consider the special cases (i) and (ii) above. If for each , is the Dirac measure at zero, the function is given by
[TABLE]
Moreover, if is the exponential distribution with parameter for each , then
[TABLE]
Remark 5.
If , then the LDP of Theorem D (1) remains valid in a modified form. In fact, it still holds with speed , but the rate function is then given by , where is the Legendre-Fenchel transform of
[TABLE]
and is the function .
1.4. Structure
The remaining parts of this text are structured as follows. Two applications of our results to random and non-random projections of $\ell_p^n$-balls are discussed in Section 2. In Section 3 we rephrase some preliminary results, which are used in the proofs of Theorems A, B, C, and D. The latter are contained in Section 4. More precisely, we develop a crucial probabilistic representation for the involved random variables in Section 4.1 and then prove Theorem A in Section 4.2, Theorem B in Section 4.3, Theorem C in Section 4.4, and Theorem D in Section 4.5.
2. Application to projections of $\ell_p^n$-balls
2.1. Random versus non-random subspaces
Projections of $\ell_p^n$-balls to lower-dimensional subspaces were the subject of a number of studies, see, e.g., [1, 2, 11, 14, 16, 17]. In these works two different set-ups were studied, one in which the subspace one projects onto is random, and another one, in which the choice of the subspace is deterministic (for an extensive comparison of both situations for one-dimensional projections we refer the reader to [10, 11]). We shall use the limit theory for the general distributions on $\ell_p^n$-balls presented in the previous section to compare both approaches. We start by recalling the framework for projections onto random subspaces taken from [1, 2]. We let be a sequence of integers satisfying and , as . Moreover, for each , let be uniformly distributed on and let be a uniformly distributed -dimensional random subspace (where the uniform distribution refers to the Haar probability measure on the Grassmannian of all -dimensional linear subspaces in ). We assume that the two sequences and are independent. Moreover, we denote by the orthogonal projection of onto . The quantity studied in [1, 2] is the Euclidean norm of the projection of the random vector onto the random subspace , i.e., .
We first rephrase the central limit theorem [2, Theorem 1.1]. It says that if , as , then
[TABLE]
where is a centered Gaussian random variable with variance
[TABLE]
Observe that taking the constant coincides with from Theorem A if we take there.
Next, we recall the LDP for the same quantities from [1, Theorem 1.2] (for simplicity we restrict ourselves to the case , since only in this case an explicit form of the rate function is available). Using the same notation as before, it says that for any the sequence of random variables satisfies an LDP with speed and good rate function
[TABLE]
whenever .
The projections onto random subspaces as just described can be compared with projections onto sequences of deterministic subspaces. In fact, our distributional framework allows us to deal with projections onto coordinate subspaces. Namely, let the sequence be as above and let, for each , be uniformly distributed in the -dimensional -ball with . We denote by the orthogonal projection of onto the first coordinates. Thus, is the projection from to , which in turn can be identified with .
Theorem E (Central limit theorem for deterministic projections).
Assume that , as . Then,
[TABLE]
where is a centered Gaussian random variable with variance
[TABLE]
Proof.
Recalling the special case (iii) for from the previous section, we see that the projected random vector has distribution on , where is a gamma distribution with shape parameter and rate . We are going to apply the central limit theorem to the gamma distribution with the aim of verifying condition (4) of Theorem B. Keeping in mind that is now the dimension parameter of the projection, we define
[TABLE]
In addition, we have that
[TABLE]
as . Assume, for a moment, that . Then, even though can vanish, we have . Under these circumstances, the central limit theorem is applicable to the gamma distribution and yields that
[TABLE]
On the other hand, if stays bounded, then stays bounded, hence the sequence is tight, and since (recall that ), we conclude that (9) still holds with . Summarizing, we conclude that (9) always holds under the assumptions of the theorem. Indeed assume that (9) is violated. Since the sequence has uniformly bounded variances, we could pass to a subsequence for which converges weakly to some distribution different from . Passing one more time to a subsequence, we could assume that either or is bounded. However, as was explained above, this would lead to a contradiction.
We can thus apply Theorem B with and dimension parameter instead of to conclude that
[TABLE]
where is a centered normal random variable with variance
[TABLE]
After recalling that , this can be written in the form
[TABLE]
To complete the proof, we need to replace the factor by . That this is always possible can be seen as follows. For we define
[TABLE]
Then (10) reads as , and our aim is to show that the same is true with replaced by . To this end we write
[TABLE]
Since , as , the first term converges in distribution to by Slutsky’s theorem, and it remains to prove that $b_n\big(1-\frac{a_n'}{a_n}\big)\to 0$. This is done as follows:
[TABLE]
Summarizing, we have shown that (10) is in fact equivalent to
[TABLE]
thus completing the proof. ∎
Theorem E, together with (7), leads us to the remarkable observation that we have the same central limit behavior regardless of whether we project onto uniform random subspaces of dimensions or onto deterministic coordinate subspaces of the same dimension, provided their dimension is sufficiently large, i.e., if as . Indeed, the centering in both results is the same, and it is easy to check that . On the other hand, if , we still have a central limit theorem for the (suitably centered and rescaled) quantities and , with the same centering, but this time with different limiting variances and , respectively.
A comparison similar to that for the central limit theorem can be made on the large deviations scale. We restrict ourselves to the case and , that is . We are interested in large deviations of , which is distributed as the -norm of a random vector with the probability law on , where is the gamma distribution , as above. Let us check that the sequence of random variables with having distribution is exponentially equivalent to $0$ in the sense of (6). Fix some . Since , the convolution property of the gamma distribution in its shape parameter entails that, for large , the random variable is stochastically dominated by a sum of i.i.d. -distributed random variables. Note that . Moreover, again for sufficiently large, . We deduce from this and Cramér’s theorem (see Lemma 8 below) that for large
[TABLE]
where is some constant depending on and , but since the dependence on can be omitted. Note that the above argument would fail if . Thus,
[TABLE]
since . In this case, Theorem D can be applied with and we obtain an LDP for with speed and the rate function given in Theorem D. Since , we conclude that satisfies an LDP with speed and the same rate function as in (8) with there. Again, this shows that the same large deviations behavior is present regardless of whether we project onto uniform random subspaces of dimensions or onto deterministic coordinate subspaces of the same dimension, again provided their dimension is sufficiently large in the sense that , as .
2.2. One-dimensional random projections of $\ell_p^n$-balls
In this section we present another application of our main results demonstrating the advantage of studying the more general distributions on the -balls. In [13, Corollary 2.6], we proved a generalization to -balls of a central limit theorem obtained by Paouris, Pivovarov, and Zinn [18, p. 703] and Kabluchko, Litvak, and Zaporozhets [12, Theorem 3.6] for the width of orthogonal projections of the -dimensional cube onto a uniformly distributed random direction. For with and a random vector chosen from with respect to the cone probability measure (which in this case coincides with the normalized spherical Lebesgue measure), it was shown in [13] that, as ,
[TABLE]
where is a centered Gaussian random variable with variance
[TABLE]
Here, denotes the Hölder conjugate of satisfying , denotes the orthogonal projection onto the line spanned by , and
[TABLE]
While the argument to obtain this central limit theorem had to be extracted from the proof of the main result [13, Theorem 1.1], it is in our set-up a direct consequence of Theorem A, since we study more general distributions for which the choice and yields that is just the cone probability measure on . More precisely, to obtain the central limit theorem above, we use the representation (12) and apply Theorem A with the choice , , replaced by , and take .
Beyond the Gaussian fluctuations just described, our results in Theorems C and D concerning moderate and large deviations allow us to deduce the complementing MDPs and LDPs for the length of the orthogonal projection of onto a random direction as well. We start with the description of the moderate deviations behaviour. Using Theorem C with the choice , , and replaced by with , we obtain that the sequence of random variables
[TABLE]
satisfies an MDP with speed and good rate function , where is a sequence of positive real numbers satisfying and , and the constant is as in (11).
The large deviations are obtained similarly. Using Theorem D with the choice , , and replaced by with (we restrict ourselves to this case, since only in this case we have a closed form expression for the rate function), we obtain that the sequence of random variables
[TABLE]
satisfies an LDP with speed and good rate function
[TABLE]
Finally, we mention that the constant can be explicitly expressed as
[TABLE]
in terms of the parameter .
3. Preliminaries
In this section we briefly present some background material used throughout the rest of this text. For convenience of the reader, we split this into different subsections that may be skipped depending on the reader’s background.
3.1. Generalized Gaussian random variables
Let us denote, for , by a sequence of independent copies of a -generalized Gaussian random variable with Lebesgue density
[TABLE]
where the normalization constant is given by . Next, recall the definition of the constant from (2). It can be used to express first- and second-order moments of -generalized Gaussian random variables as follows. Namely, for we have that
[TABLE]
see [2, Lemma 3.1]. Note that .
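The moment identities above can be checked numerically. The following minimal sketch assumes that the $p$-generalized Gaussian density is proportional to $e^{-|x|^p/p}$ (the displayed density is not reproduced in this text, so this normalization is an assumption) and that the constant in question is the absolute moment $\mathbb{E}|Y|^q$; the function names are hypothetical. Under that assumption $|Y|^p/p$ is gamma distributed with shape $1/p$, which gives a closed form that can be compared with a Monte Carlo estimate.

```python
import math
import numpy as np

def abs_moment(p: float, q: float) -> float:
    """E|Y|^q for Y with density proportional to exp(-|x|^p / p) (assumed normalization)."""
    # |Y|^p / p ~ Gamma(shape=1/p, scale=1), so E|Y|^q = p^(q/p) * Gamma((q+1)/p) / Gamma(1/p).
    return p ** (q / p) * math.gamma((q + 1) / p) / math.gamma(1 / p)

def sample_generalized_gaussian(p, size, rng):
    """Sample via the gamma representation of |Y|^p / p and an independent random sign."""
    t = rng.gamma(shape=1.0 / p, scale=1.0, size=size)
    signs = rng.choice([-1.0, 1.0], size=size)
    return signs * (p * t) ** (1.0 / p)

rng = np.random.default_rng(0)
p, q = 1.5, 3.0
y = sample_generalized_gaussian(p, 400_000, rng)
mc_estimate = float(np.mean(np.abs(y) ** q))   # Monte Carlo estimate of E|Y|^q
closed_form = abs_moment(p, q)                 # closed-form value under the assumed density
```

With this normalization the $p$-th absolute moment equals one for every $p$, which is why the scaling $e^{-|x|^p/p}$ is convenient here.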
The family of -generalized Gaussian random variables can be used to describe a probabilistic interpretation of the distributions that were defined in the introduction. This interpretation is one of the key devices in the proofs of Theorems A, C, and D.
Lemma 6 (Probabilistic interpretation, Theorem 3 in [6]).
Let , be a random vector with independent $p$-generalized Gaussian coordinates, and assume that is a non-negative random variable with distribution , which is independent of . Then the random vector
[TABLE]
is distributed according to the measure .
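Lemma 6 translates directly into a sampling recipe. The sketch below is a hedged illustration rather than the authors' code: it assumes that the random vector in the lemma is $Y/(\|Y\|_p^p+W)^{1/p}$, as in Theorem 3 of [6], that the generalized Gaussian density is proportional to $e^{-|x|^p/p}$, and that the uniform distribution on the ball corresponds to an exponential $W$ with mean $p$, while $W=0$ gives the cone measure. All function names are hypothetical.

```python
import numpy as np

def sample_p_gaussian(p, shape, rng):
    """p-generalized Gaussian samples, assuming density proportional to exp(-|x|^p / p)."""
    t = rng.gamma(shape=1.0 / p, scale=1.0, size=shape)
    signs = rng.choice([-1.0, 1.0], size=shape)
    return signs * (p * t) ** (1.0 / p)

def sample_ball_vectors(n, p, w, rng):
    """Rows are Y / (||Y||_p^p + W)^(1/p); w holds one non-negative W per row."""
    w = np.asarray(w, dtype=float)
    y = sample_p_gaussian(p, (w.size, n), rng)
    radius_p = np.sum(np.abs(y) ** p, axis=1) + w
    return y / radius_p[:, None] ** (1.0 / p)

rng = np.random.default_rng(1)
n, p, m = 5, 1.5, 50_000
# Exponential W with mean p: assumed to give the uniform distribution on the ball.
uniform_pts = sample_ball_vectors(n, p, rng.exponential(scale=p, size=m), rng)
# W = 0: assumed to give the cone probability measure on the sphere.
cone_pts = sample_ball_vectors(n, p, np.zeros(m), rng)
uniform_norms = np.sum(np.abs(uniform_pts) ** p, axis=1) ** (1.0 / p)
cone_norms = np.sum(np.abs(cone_pts) ** p, axis=1) ** (1.0 / p)
```

A quick plausibility check for the uniform case is that the $n$-th power of the $\ell_p$-norm of a uniform point in the ball is uniformly distributed on $[0,1]$, so its sample mean should be close to $1/2$.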
3.2. Moderate and large deviations
Let be a sequence of random vectors on some probability space taking values in a Hausdorff topological space . Further, let be an increasing sequence of real numbers and be a lower semi-continuous function with compact level sets for all . One says that satisfies a large deviations principle (LDP) on with speed and good rate function , provided that
[TABLE]
for all Borel sets , where denotes the interior and the closure of . As already discussed in the introduction, a moderate deviations principle (MDP) is formally the same as an LDP, but on a different range of scales.
We shall now present a few basic results from large deviations theory which are needed below. Assume that a sequence of random variables satisfies an LDP with speed and rate function . Suppose now that is a sequence of random variables that are ‘close’ to the ones from the first sequence. The next result provides conditions under which in such a situation an LDP from the first can be transferred to the second sequence.
Lemma 7 (Exponential equivalence, Theorem 4.2.13 in [8]).
Let and be two sequences of -valued random vectors and assume that satisfies an LDP on with speed and rate function . Further, suppose that the two sequences and are exponentially equivalent, which is to say that
[TABLE]
for any . Then satisfies an LDP on with the same speed and the same rate function.
Next, we recall what is known as Cramér’s theorem. It provides an LDP for sequences of independent and identically distributed random variables.
Lemma 8 (Cramér’s theorem, Theorem 2.2.3 in [8]).
Let be a sequence of i.i.d. random variables. Assume that for all for some . Then the sequence of random variables satisfies an LDP on with speed and good rate function $\mathbb{I}(x)=\sup\big\{\lambda x-\log\mathbb{E}e^{\lambda X_{1}}:\lambda\in\mathbb{R}\big\}$, i.e., is the Legendre-Fenchel transform of the log-moment generating function .
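To make the rate function in Cramér's theorem concrete, the following sketch approximates the Legendre-Fenchel transform by maximizing over a finite grid of $\lambda$ values. The helper name, the grids, and the two test distributions (standard Gaussian and exponential with rate one) are illustrative choices, not taken from the text.

```python
import numpy as np

def legendre_fenchel(log_mgf, x, lambdas):
    """Approximate sup over lambda of (lambda * x - log_mgf(lambda)) on a finite grid."""
    return float(np.max(lambdas * x - log_mgf(lambdas)))

# Standard Gaussian: log E exp(lambda X) = lambda^2 / 2, so the rate function is x^2 / 2.
gauss_grid = np.linspace(-10.0, 10.0, 200_001)
gauss_rate = legendre_fenchel(lambda lam: 0.5 * lam ** 2, 1.3, gauss_grid)

# Exponential with rate 1: log-MGF is -log(1 - lambda) for lambda < 1,
# and the rate function is x - 1 - log(x) for x > 0.
exp_grid = np.linspace(-50.0, 1.0, 500_001)[:-1]  # drop the endpoint lambda = 1
exp_rate = legendre_fenchel(lambda lam: -np.log1p(-lam), 2.0, exp_grid)
```

The grid maximum slightly underestimates the true supremum, so in practice one would refine the grid or use a one-dimensional optimizer when more accuracy is needed.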
Let and suppose that is a sequence of -valued random vectors and that is a sequence of -random vectors. We assume that both sequences satisfy LDPs with the same speed. The next result, taken from [1, Proposition 2.4], yields that also the sequence of -valued random vectors satisfies an LDP and provides the form of the rate function.
Lemma 9.
Assume that satisfies an LDP on with speed and good rate function and that satisfies an LDP on with speed and good rate function . Then, if and are independent for each , the sequence of random vectors satisfies an LDP on with speed and good rate function given by , .
Finally, we consider the possibility of transporting a large deviations principle to another one by means of a continuous function, a result known as the contraction principle.
Lemma 10 (Contraction principle, Theorem 4.2.1 in [8]).
Let and be two Hausdorff topological spaces and let be a continuous function. Further, let be a sequence of -valued random elements that satisfies an LDP with speed and good rate function . Then the sequence of -valued random elements satisfies an LDP with the same speed and with good rate function , i.e.,
[TABLE]
with the convention that if .
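As a small worked instance of the contraction principle, consider the Gaussian rate function $x\mapsto x^2/2$ pushed forward under the continuous map $x\mapsto x^2$: the infimum over the preimage equals $y/2$ for $y\ge 0$ and $+\infty$ for $y<0$. The sketch below approximates this infimum on a grid; the function name, the tolerance, and the example itself are hypothetical and only meant to illustrate the formula.

```python
import numpy as np

def contracted_rate(rate, mapping, y, grid, tol=1e-3):
    """Approximate inf{rate(x) : mapping(x) = y} using grid points with |mapping(x) - y| <= tol."""
    mask = np.abs(mapping(grid) - y) <= tol
    if not np.any(mask):
        return float("inf")  # convention: the infimum over the empty set is +infinity
    return float(np.min(rate(grid[mask])))

grid = np.linspace(-5.0, 5.0, 1_000_001)
gaussian_rate = lambda x: 0.5 * x ** 2
square = lambda x: x ** 2

value_at_2 = contracted_rate(gaussian_rate, square, 2.0, grid)         # close to 2 / 2 = 1
value_at_minus_1 = contracted_rate(gaussian_rate, square, -1.0, grid)  # empty preimage
```

The tolerance replaces the exact preimage by a thin neighbourhood, so the result is accurate only up to an error of the order of `tol`.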
As explained before, a moderate deviations principle is formally nothing else than a large deviations principle and describes (in our set-up) the deviation probabilities at scales between a law of large numbers and a central limit theorem. An important tool for us will be the following MDP for sums of independent and identically distributed random vectors.
**Lemma 11** (MDP for sums of random vectors, Theorem 3.7.1 in [8]).
Let be a sequence of independent and identically distributed random vectors in and let be a sequence of positive real numbers such that and . We assume that is centered, its covariance matrix is invertible, and for all in a ball around the origin having positive radius. Then the sequence of random vectors , , satisfies an LDP with speed (i.e., an MDP) and good rate function , .
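For orientation, consider the one-dimensional case under the usual form of this result in [8]: we write $s_n$ for the scaling sequence with $s_n\to\infty$ and $s_n/\sqrt{n}\to 0$, $\sigma^2$ for the variance of the summands, and take the normalization $\sqrt{n}\,s_n$ and speed $s_n^2$. These symbols are our own labels for this sketch. The rate function is then $x\mapsto x^2/(2\sigma^2)$ and, for every fixed $t>0$,

```latex
\lim_{n\to\infty}\frac{1}{s_n^2}\,
\log\mathbb{P}\bigg(\Big|\frac{1}{\sqrt{n}\,s_n}\sum_{i=1}^{n}X_i\Big|\geq t\bigg)
\;=\; -\inf_{|x|\geq t}\frac{x^2}{2\sigma^2}
\;=\; -\frac{t^2}{2\sigma^2}.
```

In other words, on the scale $\sqrt{n}\,s_n$ the tail probabilities still exhibit Gaussian decay, although the deviations considered are larger than those covered by the central limit theorem.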
**Remark 12.**
There exist versions of Lemma 11 under less restrictive assumptions on the (exponential) moments of the involved random vectors, see [3], for example. However, such results do not lead to simplifications or improvements in our situation.
4. Proof of the main results
4.1. A probabilistic representation for the -norm
In a first step we develop a probabilistic representation for the random variables , which will turn out to be useful for the proofs of both the central limit theorems and the moderate deviations principle. In what follows we let be a sequence of independent -generalized Gaussian random variables and define, for each ,
[TABLE]
where .
**Lemma 13** (Probabilistic interpretation).
Fix , and . Let be a Borel probability measure on . Let be distributed according to and be distributed according to and independent of . Then
[TABLE]
where is such that, for some , we have whenever .
Proof.
We first observe that as a consequence of Lemma 6 the random vector has the probabilistic representation
[TABLE]
where is a vector of independent -generalized Gaussian random variables and is a random variable with distribution , which is independent of . Thus
[TABLE]
Recalling the definitions of the random variables and , we can rewrite the last expression as
[TABLE]
Next, we define the function
[TABLE]
where stands for the domain of . Clearly, some open neighborhood of is contained in , and a Taylor expansion of around shows that for all ,
[TABLE]
where the function is such that, for some , we have whenever . Combining this with the representation (14) for proves the claim. ∎
4.2. Proof of the central limit theorem (Theorem A)
For each let us define the random variable
[TABLE]
It follows from Lemma 13 that
[TABLE]
For any , we decompose into the random variables
[TABLE]
Slutsky’s theorem (see [7, Proposition A.42 (b)]) completes the proof of Theorem A once we show that
[TABLE]
where is the centered Gaussian random variable as in Theorem A.
Assumption (3) says that converges in distribution to $0$, as . Therefore, the multivariate central limit theorem applied to and the continuous mapping theorem yield
[TABLE]
where is a centered Gaussian random vector in with covariance matrix given by
[TABLE]
As a consequence, is a centered Gaussian random variable with variance
[TABLE]
Finally, we shall argue that . To this end, we write
[TABLE]
Since there exist such that whenever , we obtain
[TABLE]
The weak law of large numbers ensures that, as , the first two probabilities converge to zero, while our assumption (3) on the random variables ensures that the last probability tends to zero as well. Thus, for any , we have that
[TABLE]
Again by the weak law of large numbers we have that and both converge to zero in probability, as . Moreover, the central limit theorem implies that and converge in distribution to non-degenerate Gaussian random variables. Hence, Slutsky’s theorem implies that and both converge to zero in distribution. Since the random variables are defined on the same probability space and because the limit is (almost surely) constant, we even have that and converge to zero in probability. Finally, also converges to zero in probability by our assumption (3). Thus, the first probability in the last expression also converges to zero, while the second summand has already been treated before. As a consequence, we conclude that indeed
[TABLE]
which completes the argument.
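The centering behind the central limit theorem can be checked numerically. The following is a minimal simulation sketch for the uniform distribution on the unit $\ell_p^n$-ball, not the general class of measures treated above. It assumes that the $p$-generalized Gaussian density is proportional to $e^{-|x|^p/p}$, that a uniform point can be generated as $U^{1/n}Y/\|Y\|_p$ with $U$ uniform on $[0,1]$ (the classical Schechtman-Zinn representation), and that the centering constant is $M_p(q)^{1/q}$ with $M_p(q)=\mathbb{E}|Y_1|^q$. All function names are hypothetical.

```python
import math
import random

def sample_p_gaussian(p, rng):
    """Draw from the density proportional to exp(-|x|^p / p).

    Uses the fact that |Y|^p / p is Gamma(1/p, 1) distributed.
    """
    g = rng.gammavariate(1.0 / p, 1.0)
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * (p * g) ** (1.0 / p)

def sample_uniform_ball(n, p, rng):
    """Uniform point in the unit l_p^n ball via U^(1/n) * Y / ||Y||_p."""
    y = [sample_p_gaussian(p, rng) for _ in range(n)]
    norm_p = sum(abs(v) ** p for v in y) ** (1.0 / p)
    radius = rng.random() ** (1.0 / n)
    return [radius * v / norm_p for v in y]

def moment(p, q):
    """M_p(q) = E|Y|^q = p^(q/p) * Gamma((q+1)/p) / Gamma(1/p)."""
    return p ** (q / p) * math.gamma((q + 1.0) / p) / math.gamma(1.0 / p)

def normalized_q_norm(x, p, q):
    """n^(1/p - 1/q) * ||x||_q for a vector x of length n."""
    n = len(x)
    norm_q = sum(abs(v) ** q for v in x) ** (1.0 / q)
    return n ** (1.0 / p - 1.0 / q) * norm_q

rng = random.Random(0)
n, p, q = 2000, 2.0, 1.0
x = sample_uniform_ball(n, p, rng)
observed = normalized_q_norm(x, p, q)   # random, fluctuates on the scale 1/sqrt(n)
expected = moment(p, q) ** (1.0 / q)    # deterministic centering constant
```

Repeating the experiment many times and plotting $\sqrt{n}\,(\text{observed}/\text{expected}-1)$ would give an empirical picture of the Gaussian fluctuations; the sketch only checks the centering.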
4.3. Proof of the generalized central limit theorem (Theorem B)
Since the proof of Theorem B is very similar to that of Theorem A, we restrict ourselves to the details that need to be adapted.
First of all, we recall that
[TABLE]
Then, following the proof of Lemma 13 with minimal changes, we obtain
[TABLE]
We define for each the random variable
[TABLE]
In the same way as in the proof of Lemma 13, one shows that with
[TABLE]
Thus, using Slutsky’s theorem, we conclude the result of Theorem B once we have shown that
[TABLE]
where is the Gaussian random variable from the statement of Theorem B.
We start with the assertion on the sequence . First of all, we notice that by Assumption (4), the multivariate central limit theorem applied to , and the continuous mapping theorem,
[TABLE]
where is independent of the centered Gaussian random vector in with covariance matrix given by
[TABLE]
The limiting variable is centered Gaussian. To compute its variance, observe that
[TABLE]
Thus, the limiting variance is given by
[TABLE]
where the second line follows by recalling (2) and performing computations with gamma functions.
To show that , as , we can in principle follow the lines of the proof of Theorem A, but we have to replace the terms and there by and , respectively. In particular, in a first step this results in showing that both sequences converge in distribution to $0$, that is for every fixed ,
[TABLE]
as . Both claims easily follow from Slutsky’s theorem after recalling that both and converge in distribution to normal random variables, and that .
Moreover, in a second step one needs to argue that for any fixed ,
[TABLE]
as . Recall that all three sequences converge in distribution to normal random variables. For the former two sequences, this follows from the central limit theorem, whereas the claim for is a consequence of our assumption (4). Again by a Slutsky-type argument, the sequences , and converge to zero in probability, hence so does their sum. This establishes (17) and hence (16), which completes the proof of Theorem B.
4.4. Proof of the moderate deviations principle (Theorem C)
Let be a sequence of positive real numbers such that and . As in the proof of the central limit theorem, we consider the sequence of random variables
[TABLE]
and observe that Lemma 13 implies
[TABLE]
where is such that whenever for some .
Our strategy to prove the moderate deviations principle of Theorem C is as follows:
1. We prove a bivariate moderate deviations principle for the sequence of rescaled random vectors in .
2. We apply the contraction principle to deduce a moderate deviations principle for the linear combination .
3. We show that the sequence of random variables is exponentially equivalent to the sequence formed in step 2.
We start with the first step of the proof.
**Lemma 14** (Bivariate MDP).
Fix and with . Let be a sequence of positive real numbers such that and and consider the random vectors
[TABLE]
Then the sequence of random vectors satisfies an MDP on with speed and good rate function
[TABLE]
where .
Proof.
First, we observe that is a sum of centered i. i. d. random vectors in with covariance matrix
[TABLE]
given by
[TABLE]
The moment generating function of the random vector on is given by
[TABLE]
Since , the function is finite on , a set which contains the origin in its interior. Therefore, Lemma 11 (with the choice there) implies that the sequence of random variables satisfies an MDP on with speed and good rate function
[TABLE]
Inserting the values for and , and simplifying the resulting expression proves the claim. ∎
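The covariance matrix and the resulting quadratic rate function can be evaluated numerically. The sketch below assumes that the summands are the centered vectors $(|Y_i|^q-M_p(q),\,|Y_i|^p-1)$, that the $p$-generalized Gaussian density is proportional to $e^{-|x|^p/p}$, and that $M_p(q)=\mathbb{E}|Y_1|^q$. The function names are hypothetical and the example is not part of the original proof.

```python
import math

def moment(p, q):
    """E|Y|^q for Y with density proportional to exp(-|x|^p / p)."""
    return p ** (q / p) * math.gamma((q + 1.0) / p) / math.gamma(1.0 / p)

def covariance_matrix(p, q):
    """Covariance matrix of (|Y|^q, |Y|^p) as [[c11, c12], [c12, c22]]."""
    c11 = moment(p, 2.0 * q) - moment(p, q) ** 2
    c12 = moment(p, p + q) - moment(p, q) * moment(p, p)
    c22 = moment(p, 2.0 * p) - moment(p, p) ** 2
    return [[c11, c12], [c12, c22]]

def mdp_rate(x, cov):
    """Quadratic rate function 0.5 * x^T cov^{-1} x for a 2x2 covariance matrix."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    x1, x2 = x
    # The inverse of [[a, b], [b, d]] is [[d, -b], [-b, a]] / det.
    return 0.5 * (d * x1 * x1 - 2.0 * b * x1 * x2 + a * x2 * x2) / det

cov = covariance_matrix(2.0, 1.0)   # example parameters p = 2, q = 1
rate = mdp_rate((1.0, 0.0), cov)
```

Under these assumptions the entries simplify to $M_p(2q)-M_p(q)^2$, $q\,M_p(q)$ and $p$, which follows from the gamma function identity $\Gamma(z+1)=z\,\Gamma(z)$.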
**Remark 15.**
In the previous proof we used our assumption that in order to verify the finiteness of certain exponential moments. As already discussed in Remark 12 above, there exist versions of the MDP for sums of independent random vectors that do not require the finiteness of such exponential moments. However, even when applying such weaker versions, for example those from [3], the assumption that is in fact needed.
We continue with the second step and use the contraction principle to obtain an MDP for the linear combinations of and .
**Lemma 16** (MDP for the core term).
Let be a sequence of positive real numbers such that and . Then the sequence of random variables
[TABLE]
satisfies an MDP on with speed and good rate function , where is the constant in Theorem A.
Proof.
Consider the continuous function
[TABLE]
and observe that, for each , the random variable has the same distribution as , where was defined in Lemma 14. Thus, the contraction principle (see Lemma 10) implies the desired MDP with speed and good rate function
[TABLE]
This optimization problem leads us to the Lagrangian
[TABLE]
and the Lagrange multiplier equations
- (i) ,
- (ii) ,
- (iii) ,
where , , and are the entries of the covariance matrix given by (19). This yields the critical value
[TABLE]
and from a direct (but tedious) computation, we obtain the explicit quadratic form of the rate function. We refrain from providing the details of the computation. ∎
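The computation is an instance of a standard fact about minimizing a quadratic form under one linear constraint. As a sketch, let $\mathbf{C}$ be a generic positive definite $2\times 2$ matrix and let $a\in\mathbb{R}^2\setminus\{0\}$ be a hypothetical coefficient vector standing in for the linear map used in the contraction. Writing the stationarity condition as $\mathbf{C}^{-1}x=\mu a$ gives $x=\mu\,\mathbf{C}a$, and the constraint $\langle a,x\rangle=t$ forces $\mu=t/\langle a,\mathbf{C}a\rangle$. Hence

```latex
\inf\Big\{\tfrac{1}{2}\langle x,\mathbf{C}^{-1}x\rangle \,:\, \langle a,x\rangle=t\Big\}
\;=\; \tfrac{1}{2}\,\mu^{2}\,\langle a,\mathbf{C}a\rangle
\;=\; \frac{t^{2}}{2\,\langle a,\mathbf{C}a\rangle},
\qquad\text{attained at}\quad x^{\ast}=\frac{t}{\langle a,\mathbf{C}a\rangle}\,\mathbf{C}a .
```

The resulting rate function is again quadratic in $t$, with $\langle a,\mathbf{C}a\rangle$ playing the role of the limiting variance of the linear combination.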
We will now proceed with the third step and prove the exponential equivalence. In what follows, we let the random vectors be as in (18), the random variables as in (15), and the function be given by (20).
**Lemma 17** (Exponential equivalence - MDP).
Let be a sequence of positive real numbers such that and . Then the sequences of random variables and are exponentially equivalent.
Proof.
We start by recalling that, for each ,
[TABLE]
and
[TABLE]
Let us fix . We observe that
[TABLE]
where we used that is a non-negative random variable for each . The function is the same as in Lemma 13. Assumption (5) (with there) implies
[TABLE]
To discuss the second term, we first write
[TABLE]
where is the parameter from Lemma 13. For the first summand in the previous expression, we obtain the estimate
[TABLE]
The first two terms both decay like for a suitable by Cramér’s theorem (see Lemma 8). For the last term, we use again condition (5) (with there) and obtain
[TABLE]
where we also used that . As a consequence,
[TABLE]
since and . Recalling the definition and the properties of the function from Lemma 13, we obtain for sufficiently large
[TABLE]
where we also used that . Again by Cramér’s theorem (see Lemma 8), the first two terms decay like for suitable and their sum is bounded by for sufficiently large . Using this together with assumption (5), we obtain
[TABLE]
where we used that . Putting everything together and using [8, Lemma 1.2.15], we get
[TABLE]
Since was arbitrary, this shows the exponential equivalence that was claimed in the lemma. ∎
Proof of Theorem C.
The MDP is now a direct consequence of Lemma 7 together with the MDP for the core term (see Lemma 16) and the exponential equivalence (see Lemma 17). ∎
4.5. Proof of the large deviations principles (Theorem D)
In this last section we present the proof of the large deviations principles in Theorem D. On the way, we shall use some results we have obtained in [13]. In what follows, we assume that for each , is a vector of independent -generalized Gaussian random variables, and we assume that and are independent.
4.5.1. The case
We start by recalling that, for each , we have the distributional equality
[TABLE]
see the proof of Lemma 13. In the proof of Theorem 1.2 in [13], we have already seen that the sequence of random vectors
[TABLE]
satisfies an LDP on with speed and a good rate function . More precisely, thanks to Cramér’s theorem (see Lemma 8) can be identified as the Legendre-Fenchel transform of the function
[TABLE]
Since and are assumed to be independent and since satisfies an LDP with speed and good rate function , the sequence of random vectors
[TABLE]
satisfies an LDP on with good rate function given by
[TABLE]
where we used Lemma 9. Next, we consider the mapping
[TABLE]
which is continuous on its domain. Clearly,
[TABLE]
for each . Therefore, we can apply the contraction principle (see Lemma 10) to conclude that satisfies an LDP with speed and good rate function . This completes the argument.
**Remark 18.**
The assumption that was used only in disguise above and is behind the LDP for the sequence of random vectors in (21). Indeed, as indicated above, the proof of this LDP is based on Cramér’s theorem, which in turn requires finiteness of some exponential moments, or equivalently, that the origin is an interior point of the domain of the function defined in (22). However, from the definition of this function it is clear that this can only be the case if .
4.5.2. The case
As was shown in the proof of [13, Theorem 1.3], the sequence of random variables
[TABLE]
satisfies an LDP with speed and good rate function
[TABLE]
We will now prove that the two sequences and are exponentially equivalent.
**Lemma 19** (Exponential equivalence - LDP).
The sequences and are exponentially equivalent with rate .
Proof.
As we have seen in the proof of Lemma 13, one has that
[TABLE]
for each . Let . Then, for every , we obtain
[TABLE]
Let us consider the second term. Write
[TABLE]
and note that and for all . This leads to the estimate
[TABLE]
By Cramér’s theorem, the first term in the previous line decays exponentially like , since for all . In fact, the rate function in the corresponding LDP does not vanish in , where is an open neighborhood of , which implies that the constant stays strictly positive when letting . In combination with our Assumption (6) (applied with ) this shows that
[TABLE]
In the first inequality above, we have used the elementary fact (see, e.g., [8, Lemma 1.2.15]) that for families of non-negative real numbers , one has that
[TABLE]
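This elementary fact says that, on the exponential scale, a finite sum is governed by its largest term. A minimal numerical sketch, with hypothetical function names and with terms of the form $e^{-n r_i}$ chosen only for illustration, is the following; it works with logarithms throughout to avoid floating-point underflow.

```python
import math

def log_sum_exp(log_terms):
    """Stable evaluation of log(sum(exp(v) for v in log_terms))."""
    m = max(log_terms)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in log_terms))

def normalized_log_sum(rates, n):
    """(1/n) * log(sum_i exp(-n * rates[i]))."""
    return log_sum_exp([-n * r for r in rates]) / n

# With decay rates 1 and 2 the normalized logarithm approaches -min(1, 2) = -1.
values = [normalized_log_sum([1.0, 2.0], n) for n in (1, 10, 1000)]
```

In the proofs above this principle is what allows a probability that is bounded by a sum of several exponentially small terms to be controlled by the slowest decaying one.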
The remaining term
[TABLE]
can be treated in the same way.
Putting everything together, we obtain from the LDP for the sequence that
[TABLE]
When , the expression above tends to . Hence, the two sequences and are indeed exponentially equivalent. ∎
Proof of Theorem D, part (2).
The proof of is now a direct consequence of Lemma 19 combined with the fact that satisfies an LDP with speed and good rate function given by (23). ∎
Acknowledgement
We would also like to thank Nicola Turchi for exchanges about the topics of this paper.
ZK has been supported by the German Research Foundation under Germany’s Excellence Strategy EXC 2044 – 390685587, Mathematics Münster: Dynamics - Geometry - Structure. JP has been supported by a Visiting International Professor Fellowship from the Ruhr University Bochum and its Research School PLUS, by the Austrian Science Fund (FWF) Project F5508-N26, which is part of the Special Research Program “Quasi-Monte Carlo Methods: Theory and Applications”, and by the FWF Project P32405 “Asymptotic Geometric Analysis and Applications”. ZK and CT have been supported by the DFG Scientific Network Cumulants, Concentration and Superconcentration.
- [1] D. Alonso-Gutiérrez, J. Prochno, and C. Thäle. Large deviations for high-dimensional random projections of $\ell_p^n$-balls. Adv. in Appl. Math., 99:1–35, 2018.
- [2] D. Alonso-Gutiérrez, J. Prochno, and C. Thäle. Gaussian fluctuations for high-dimensional random projections of $\ell_p^n$-balls. Bernoulli, 2019+.
- [3] M.A. Arcones. Moderate deviations of empirical processes. In Stochastic inequalities and applications, volume 56 of Progr. Probab., pages 189–212. Birkhäuser, Basel, 2003.
- [4] I. Bárány and C. Thäle. Intrinsic volumes and Gaussian polytopes: the missing piece of the jigsaw. Doc. Math., 22:1323–1335, 2017.
- [5] I. Bárány and V. Vu. Central limit theorems for Gaussian polytopes. Ann. Probab., 35(4):1593–1621, 2007.
- [6] F. Barthe, O. Guédon, S. Mendelson, and A. Naor. A probabilistic approach to the geometry of the $\ell_p^n$-ball. Ann. Probab., 33(2):480–513, 2005.
- [7] R.F. Bass. Stochastic Processes, volume 33 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2011.
- [8] A. Dembo and O. Zeitouni. Large Deviations. Techniques and Applications, volume 38 of Stochastic Modelling and Applied Probability. Springer-Verlag, Berlin, 2010. Corrected reprint of the second (1998) edition.
