The variance of the $\ell_p^n$-norm of the Gaussian vector, and Dvoretzky's theorem
Anna Lytova, Konstantin Tikhomirov

TL;DR
This paper provides a complete characterization of the variance of the $\ell_p^n$-norm of Gaussian vectors for all $p$, revealing two transition points and implications for Dvoretzky's theorem.
Contribution
It fully determines the variance of the $\ell_p^n$-norm of Gaussian vectors across all $p$, including the logarithmic regime, and identifies two key transition points.
Findings
Variance behavior changes at two transition points in $p$.
Complete characterization of variance for all $p$ in relation to Gaussian vectors.
Implications for random Dvoretzky's theorem in $\ell_p^n$ spaces.
Abstract
Let $n$ be a large integer, and let $G$ be the standard Gaussian vector in $\mathbb{R}^n$. Paouris, Valettas and Zinn (2015) showed that for all $p\in[1,c\log n]$, the variance of the $\ell_p^n$-norm of $G$ is equivalent, up to a constant multiple, to $\frac{2^p}{p}n^{2/p-1}$, and for $p\in[C\log n,\infty]$, $\mathrm{Var}\|G\|_p\simeq(\log n)^{-1}$. Here, $C,c>0$ are universal constants. That result left open the question of estimating the variance for $p$ logarithmic in $n$. In this note, we resolve the question by providing a complete characterization of $\mathrm{Var}\|G\|_p$ for all $p$. We show that there exist two transition points (windows) in which the behavior of $\mathrm{Var}\|G\|_p$, viewed as a function of $p$, significantly changes. We also discuss some implications of our result in the context of the random Dvoretzky theorem for $\ell_p^n$.
Anna Lytova (University of Opole, Poland; email: [email protected]. A significant part of this work was done when A.L. was visiting Princeton University in January–February, 2017) and Konstantin Tikhomirov (Princeton University, NJ; email: [email protected]. The research is partially supported by the Simons Foundation)
MSC 2010: 46B06, 46B09, 52A21, 60E15, 60G15
Keywords and phrases: $\ell_p^n$ spaces, variance of norm, Dvoretzky’s theorem, order statistics
1 Introduction
Let be a large integer, be a number in , and denote by the standard –norm in . Let be the standard -dimensional Gaussian vector. Variance of the –norm of may serve as a basic example of the concentration of measure phenomenon (most of the Gaussian mass is located in a thin shell of an appropriately rescaled –ball). It is well known that for a fixed , , where the quantity depends only on and not on (see, in particular, [17] and [21]), whereas the variance of the –norm of is of order (see, for example, [4, p. 47–48] and [21]). At the same time, for growing to infinity with , no sharp results were available until quite recently. In [21], Paouris, Valettas and Zinn showed that for and for ( being universal constants). This result of [21] leaves the gap in which the behavior of the variance was not clarified. The authors of [21] conjectured that the variance changes from polynomially small in to logarithmic around . This conjecture was the starting point of our work.
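The quantity under discussion can be explored numerically. The following is a minimal Monte Carlo sketch, not part of the paper: it estimates the variance of the $\ell_p^n$-norm of a standard Gaussian vector for a few values of $p$. The helper names `lp_norm` and `estimate_variance`, the dimension `n = 200`, the number of trials and the seed are all illustrative assumptions; with such a small `n` the output only hints at the qualitative picture. As a sanity check, the estimate for $p=2$ should be close to $1/2$, the limiting variance of the Euclidean norm.

```python
import math
import random

def lp_norm(x, p):
    """l_p norm of a sequence x; p = math.inf gives the max norm."""
    m = max(abs(t) for t in x)
    if math.isinf(p) or m == 0.0:
        return m
    # Factor out the largest entry so that |t|**p cannot overflow for large p.
    return m * sum((abs(t) / m) ** p for t in x) ** (1.0 / p)

def estimate_variance(n, p, trials, rng):
    """Empirical variance of ||G||_p over `trials` independent Gaussian vectors."""
    samples = [lp_norm([rng.gauss(0.0, 1.0) for _ in range(n)], p)
               for _ in range(trials)]
    mean = sum(samples) / trials
    return sum((s - mean) ** 2 for s in samples) / (trials - 1)

# Illustrative parameters (assumptions, not taken from the paper).
rng = random.Random(0)
n, trials = 200, 400
estimates = {p: estimate_variance(n, p, trials, rng)
             for p in (1, 2, 4, math.log(n), math.inf)}
for p, v in estimates.items():
    print(f"p = {p:7.3f}   Var ||G||_p ~ {v:.4f}")
```

Increasing `n` and `trials` sharpens the picture at the cost of running time; the sampling error of each estimate is of relative order `sqrt(2 / trials)`.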
The question of computing the Gaussian variance of the –norm seems natural in its own right; nevertheless, it gains further significance in the context of asymptotic geometric analysis. Since the fundamental discovery of Milman [13], it is known that Gaussian concentration properties of a norm in are strongly connected with the geometry of random subspaces of . The classical theorem of Dvoretzky [8] asserts that every infinite-dimensional Banach space contains finite-dimensional subspaces of arbitrarily large dimension which are arbitrarily close to Euclidean (in the Banach–Mazur metric). Milman showed in [13] that a stronger result holds. Given a norm in , a subspace and a real number , we will (rather unconventionally) call the subspace -spherical if . The theorem of Milman states that for any norm in with the Lipschitz constant and any , the random $\frac{c\varepsilon^{2}}{\log(1/\varepsilon)}\big(\frac{\mathbf{E}\|G\|}{L}\big)^{2}$–dimensional subspace of with uniform (rotation-invariant) distribution is –spherical with probability close to . In particular, the Dvoretzky–Rogers lemma implies that for any norm with the unit ball in John’s position, the random –dimensional subspace is –spherical with large probability. We refer to monographs and surveys [14, 22, 26, 1] for more information as well as to papers [21, 18, 19, 29, 20] for some recent developments of the subject. In this text, we leave out any discussion of the existential Dvoretzky theorem, which is concerned with finding at least one large almost Euclidean subspace (the best known general result in this direction is due to Schechtman [24]), as well as the isomorphic Dvoretzky theorem, which deals with the regime when distortion grows to infinity with (see, in particular, [15]).
In the regime of “constant distortion” (say, when ) the result of Milman is sharp, that is, if a random -dimensional subspace is –spherical with high probability then necessarily $k\leq C\big(\frac{\mathbf{E}\|G\|}{L}\big)^{2}$ (see Milman–Schechtman [16] and Huang–Wei [11] for reverse estimates matching Milman’s bound). However, when tends to zero with , the original estimate is suboptimal. Gordon [10] and later Schechtman [23] improved the dependence on from to , which is sharp for some norms but not in general. For example, it was shown in [25] and [28] that a random –dimensional subspace of is –spherical with probability close to one if and only if . Moreover, for -unconditional norms in the -position, it was proved in [29] that random –dimensional subspaces are –spherical with high probability. For arbitrary norms, the problem of interdependence between and the dimension in the random Dvoretzky theorem is wide open, and even in the class of –spaces there is no complete solution as of this writing.
Considerable progress in estimating the distortion (in the “almost isometric” regime) of uniform random subspaces of for all is due to Naor [17] and Paouris, Valettas and Zinn [21]. For a fixed , Naor [17] obtained concentration inequalities which, in particular, can be employed to show that random –dimensional sections of the –ball are –spherical with probability close to one whenever (where depends only on and is a quantity of order polylogarithmic in arising from the application of the covering argument). The bound on the dimension of typical -Euclidean subspaces of (for depending only on ) was confirmed by Paouris, Valettas and Zinn [21] in the range , and it was shown that for a fixed the estimate is close to optimal. The paper [21] provides bounds (upper and lower) for the Dvoretzky dimension, as well as concentration inequalities for the standard Gaussian vector and the Gaussian variance in different regimes, with an emphasis on the case when grows with . However, for logarithmic in , the results are not sharp.
In the context of Dvoretzky’s theorem, the –spaces for logarithmic supply rather interesting geometric examples. As was observed in [21], there are universal constants such that, say, , whereas ; thus, the variance can be quite sensitive to replacing a norm with an equivalent norm. Note that the bounds for the variance immediately imply that, for example, the random -dimensional subspace of is –spherical with probability at least for a universal constant (and instead of we can take any constant dimension). At the same time, most of –dimensional subspaces of (which is a constant Banach–Mazur distance away from ) are not even –spherical [28]. The result of [21] leaves open the question whether there is a “phase transition” point such that for any and all sufficiently large we have and , where depends only on . Our result answers this question and completely settles the problem of computing the Gaussian variance of –norms. Below, for any two quantities we write “” if for a universal constant .
Theorem A.
There is a universal constant with the following property. Let and let be the standard Gaussian vector in . Further, denote by the quantile of order with respect to the distribution of the absolute value of a standard Gaussian variable , i.e. such that . Then
- For all in the range we have
[TABLE]
- For we have
[TABLE]
- For we have
[TABLE]
Everywhere in this note, “$\log$” stands for the natural logarithm. As we mentioned before, the above estimates in the regimes and were previously derived in [21]. The variance of , the way we represent it, is a piecewise function, with the pieces equivalent at the respective boundary points. The points and (see (8)) are chosen rather arbitrarily in the sense that each one can be shifted to the right or to the left by a small constant multiple of , which would change the estimates only by a multiplicative constant. In this connection, we prefer to speak about “transition windows” rather than “transition points”.
To have a better picture of how the variance changes with , it may be useful to consider its logarithm in the range for some fixed small constant , so that the term is bounded. For , is an almost linear function of . In the range , is essentially of order (up to a bounded multiple). In the intermediate regime , disregarding additive terms double logarithmic in , behaves as , which is a convex function close to a parabola near the point .
Our result implies, in particular, that for any fixed and all sufficiently large , we have whereas for some depending only on . Observe that the Banach–Mazur distance between and is of order , so the “power of to logarithmic” transition happens at an almost isometric scale. In the context of the random Dvoretzky theorem, this implies
Corollary B.
For any , there are depending on with the following property. For any and , let , and let be a uniformly distributed random -dimensional subspace of . Then
[TABLE]
At the same time,
[TABLE]
Our result highlights an interesting characteristic of the order statistics of the standard Gaussian vector . Let be the non-increasing rearrangement of the vector of absolute values . Then Chernoff–type estimates imply that order statistics for relatively large, say, at least a positive constant power of , are strongly concentrated, so that their typical fluctuations are small (at most a negative constant power of ). Thus, the large (logarithmic in ) fluctuation of for is due to the fact that the -th powers of the first few order statistics comprise a relatively large portion of the sum with a significant probability, whereas for the -th powers of the first order statistics are typically hugely dominated by the total sum .
Our technique of proving Theorem A is in certain aspects similar to [21]. As in [21], a crucial role in our argument is played by Talagrand’s bound (see Theorem 2.1 in the next section), which allows one to get sharper estimates for the variance than the Poincaré inequality. Another important step, also present in [21], consists in obtaining strong upper bounds for negative moments of –norms. Our approach to bounding the moments is completely different from the one used in [21], as, instead of relying on general Gaussian inequalities, we employ a rather elementary but efficient technique involving lower deviation estimates for the order statistics of random vectors. This allows us to get strong estimates including the case , i.e., in the range not treated in [21]. A principal new ingredient in our proof, compared to [21], is the use of truncated Gaussians. For a number , we consider an auxiliary function
[TABLE]
and use the trivial inequality . It turns out that, for a carefully chosen truncation level (the right choice is not straightforward), both terms in the last inequality can be estimated in an optimal way by combining Talagrand’s theorem with rather elementary probabilistic arguments and bounds for truncated moments of Gaussian variables. The truncation technique is also used to obtain matching lower bounds for the variance. We will discuss this approach in more detail at the beginning of Section 4.
The organization of the rest of the paper is as follows. In Section 2 we discuss notation and state several facts important for our work as well as provide a detailed derivation of upper and lower bounds for truncated Gaussian moments (Section 2.1) and lower deviation estimates for Gaussian order statistics, using Chernoff’s inequality (Section 2.2). In Section 3 we provide upper bounds for the negative moments of –norms in terms of quantiles of the Gaussian distribution. In Section 4 we obtain upper bounds for the variance, and in Section 5 derive matching lower bounds. For the reader’s convenience, we give a proof of Corollary B in Section 6.
2 Preliminaries
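Let us start with notation and some basic facts that will be useful for us. The canonical inner product in $\mathbb{R}^n$ is denoted by $\langle\cdot,\cdot\rangle$. Given a vector $x=(x_1,\dots,x_n)\in\mathbb{R}^n$ and a real number $1\leq p<\infty$, the standard $\ell_p^n$–norm of $x$ is defined as
$$\|x\|_p:=\Big(\sum_{i=1}^n|x_i|^p\Big)^{1/p}.$$
Additionally, the $\ell_\infty^n$–norm $\|x\|_\infty:=\max_{i\leq n}|x_i|$. The following relation is true for any $1\leq p\leq q\leq\infty$:
$$\|x\|_q\leq\|x\|_p\leq n^{1/p-1/q}\|x\|_q.$$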
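As a quick numerical illustration of the standard comparison between $\ell_p^n$-norms, namely $\|x\|_q\leq\|x\|_p\leq n^{1/p-1/q}\|x\|_q$ for $1\leq p\leq q\leq\infty$, the sketch below checks both inequalities on a random vector. The function names, the test vector and the tolerance are assumptions made for the example only.

```python
import math
import random

def lp_norm(x, p):
    """l_p norm of a sequence x; p = math.inf gives the max norm."""
    if math.isinf(p):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def check_norm_comparison(x, p, q, tol=1e-9):
    """Check ||x||_q <= ||x||_p <= n^(1/p - 1/q) ||x||_q for 1 <= p <= q <= inf."""
    n = len(x)
    inv_q = 0.0 if math.isinf(q) else 1.0 / q
    low, mid = lp_norm(x, q), lp_norm(x, p)
    high = n ** (1.0 / p - inv_q) * low
    # The relative tolerance absorbs floating-point rounding in the equality cases.
    return low <= mid * (1 + tol) and mid <= high * (1 + tol)

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(50)]
pairs = [(1, 2), (2, 3.5), (2, math.inf), (5, 5)]
results = [check_norm_comparison(x, p, q) for p, q in pairs]
print(results)
```

The right-hand inequality is attained by constant vectors and the left-hand one by coordinate vectors, which is why a small tolerance is used rather than strict comparison.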
Given a real number , is the largest integer not exceeding . Universal constants are denoted by , etc. and their value may be different on different occasions. Given two quantities and , we write whenever there is a universal constant with . Further, for two non-negative quantities we write () if there is a universal constant with (respectively, ). Sometimes it will be convenient for us to write the relation as .
The expectation of a random variable will be denoted by , the variance — by , and the median — by . Given an event , by we denote the indicator function of . Throughout the text, standard Gaussian variables will be denoted by and the standard Gaussian vector in — by . It is well known (see, for example, [9, Chapter 7]) that the Gaussian distribution satisfies the relations
[TABLE]
The absolute moments of a standard Gaussian variable are given by
[TABLE]
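For experimentation, the absolute moments of a standard Gaussian variable $g$ can be evaluated through the standard closed form $\mathbf{E}|g|^p=2^{p/2}\,\Gamma\big(\frac{p+1}{2}\big)/\sqrt{\pi}$. The sketch below compares this expression with a direct numerical integration; the function names, the integration cutoff and the step count are illustrative choices, not taken from the paper.

```python
import math

def abs_moment(p):
    """E|g|^p for a standard Gaussian g, via the Gamma function (p > -1)."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)

def abs_moment_numeric(p, upper=12.0, steps=200_000):
    """Midpoint-rule approximation of 2 * int_0^upper t^p * phi(t) dt."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t ** p * math.exp(-t * t / 2.0)
    return 2.0 * total * h / math.sqrt(2.0 * math.pi)

for p in (1, 2, 3.5, 4):
    print(f"p = {p}: closed form {abs_moment(p):.6f}, "
          f"numeric {abs_moment_numeric(p):.6f}")
```

For very large `p` the closed form overflows in floating point; in that case `math.lgamma` can be used to work with logarithms instead.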
The next theorem is a basis for our analysis; its “discrete” version was proved by M. Talagrand in [27].
Theorem 2.1 (Talagrand’s bound; see
Suppose is an absolutely continuous function in and let () be the partial derivatives of . Then we have
[TABLE]
where is a universal constant.
2.1 Bounds for truncated moments of Gaussian variables
In this subsection, we derive rather elementary upper and lower bounds for high moments of random variables of the form and for a fixed . The results presented here are by no means new, but may be hard to locate in the literature. For the reader’s convenience, we provide proofs.
Let us start with a simple calculus lemma.
Lemma 2.2.
Fix . Let be a positive log-concave function on , and let be a point of global maximum for . Define
[TABLE]
Then
[TABLE]
As a consequence of the above statement, we get
Lemma 2.3.
Let be some real numbers.
- If then
[TABLE]
- If then
[TABLE]
Proof.
It is easy to see that is the point of global maximum of the log-concave function on , and . We will use the last lemma to evaluate the integrals.
The case . We can assume without loss of generality that is large (greater than a large absolute constant). To get the desired bound it is enough to show that , where are the two solutions of the equation . We can rewrite the equation in the form
[TABLE]
Since is large, we can assume that all solutions of the last equation satisfy for a small constant . Then, using Taylor’s expansion for the logarithm, we obtain
[TABLE]
for the two solutions of (6). Hence,
[TABLE]
The result follows.
The case . Let , , be defined as in Lemma 2.2. We have , and . To get the desired bounds, it suffices to show that there exist constants , such that
[TABLE]
where , and then apply Lemma 2.2. We will rely on the fact that is strictly increasing on .
For a sufficiently small universal constant we have , . Hence, for we obtain
[TABLE]
where in the last inequality we used the condition and the fact that is small. Thus, whence .
Now, choose , where is a large enough universal constant (say, definitely suffices). If then obviously , and we are done. Otherwise, we use the trivial relation , , to obtain
[TABLE]
When , the last term is less than , whereas for , the second term is less than . In any case, we get , whence . The result follows. ∎
Corollary 2.4.
Let be some real numbers.
(i) If then
[TABLE]
(ii) If then
[TABLE]
(iii) In particular, if for some , then
[TABLE]
Proof.
Since
[TABLE]
then, applying (4), we get
[TABLE]
This and the second part of Lemma 2.3 yield the assertion for and for . The case follows from the first part of Lemma 2.3 and the fact that . ∎
Remark 2.5.
The last statement asserts that, for , the -th moments of the truncated variables and are equivalent, up to a constant multiple, to the (non-truncated) absolute moment (see (5)).
2.2 Chernoff–type bounds for order statistics
Given any number , the quantile of order with respect to the distribution of is the number satisfying . It follows from (4) that
[TABLE]
Standard estimates for quantiles of the Gaussian distribution (see, for example, [7, p. 264]) imply that for we have
[TABLE]
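Quantiles of $|g|$ are easy to compute numerically, which can help when checking estimates of this kind by hand. The sketch below assumes the upper-tail convention $\mathbf{P}\{|g|\geq\xi_q\}=q$ (an assumption about the normalization, chosen for the example) and compares $\xi_q$ with the elementary upper bound $\sqrt{2\log(1/q)}$, which follows from $\mathbf{P}\{|g|\geq t\}\leq e^{-t^2/2}$. The function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian, mean 0 and variance 1

def abs_gaussian_quantile(q):
    """Return xi with P(|g| >= xi) = q for a standard Gaussian g, 0 < q < 1."""
    return STD_NORMAL.inv_cdf(1.0 - q / 2.0)

def tail_of_abs(t):
    """P(|g| >= t) for t >= 0, via the complementary error function."""
    return math.erfc(t / math.sqrt(2.0))

rows = []
for q in (1e-1, 1e-2, 1e-4, 1e-8):
    xi = abs_gaussian_quantile(q)
    rows.append((q, xi, math.sqrt(2.0 * math.log(1.0 / q)), tail_of_abs(xi)))
for q, xi, proxy, tail in rows:
    print(f"q = {q:.0e}  xi_q = {xi:.4f}  "
          f"sqrt(2 log(1/q)) = {proxy:.4f}  tail = {tail:.3e}")
```

For extremely small `q` the expression `1.0 - q / 2.0` loses precision in floating point, so the sketch is meant for moderate tail levels only.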
Further, for the standard Gaussian vector in , the order statistics of , denoted by , are the non-increasing rearrangement of the vector of absolute values . Given , we have
[TABLE]
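The law of a single order statistic can be written through a binomial tail: $g_k^*\geq t$ holds exactly when at least $k$ of the $n$ independent variables $|g_i|$ are at least $t$. The following sketch, with hypothetical helper names and illustrative parameters, computes the non-increasing rearrangement and compares this binomial expression with a simulation.

```python
import math
import random

def order_statistics(x):
    """Non-increasing rearrangement of the absolute values of x."""
    return sorted((abs(t) for t in x), reverse=True)

def prob_kth_at_least(n, k, t):
    """P(g*_k >= t): at least k of n i.i.d. |g_i| are >= t (binomial tail)."""
    q = math.erfc(t / math.sqrt(2.0))  # P(|g| >= t) for a standard Gaussian
    return sum(math.comb(n, j) * q ** j * (1.0 - q) ** (n - j)
               for j in range(k, n + 1))

# Illustrative parameters (assumptions for the example only).
rng = random.Random(2)
n, k, t, trials = 50, 3, 2.0, 4000
hits = sum(order_statistics([rng.gauss(0.0, 1.0) for _ in range(n)])[k - 1] >= t
           for _ in range(trials))
empirical, exact = hits / trials, prob_kth_at_least(n, k, t)
print(f"empirical {empirical:.3f}  vs  binomial tail {exact:.3f}")
```

Writing the event in terms of a binomial count is what makes Chernoff-type bounds for partial binomial sums applicable to order statistics.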
It follows from Chernoff’s theorem for the partial binomial sums (see [5] for the original result, or [2, p. 24] as a modern reference) that for we have
[TABLE]
Applying the relation (), we get
[TABLE]
The relation (9) allows one to derive deviation inequalities for order statistics. Let us remark at this point that, although order statistics are systematically studied in the literature (see the classical book [7], or the paper [3] as an example of recent developments), we were not able to locate results in a form convenient for us. For completeness, we provide proofs of the next three lemmas.
Lemma 2.6 (Lower deviation for large order statistics).
There are universal constants with the following property. Assume that is large, and that . Let . Then
[TABLE]
Proof.
Let satisfy the assumptions and let be such that . Observe that, in view of the lower bound on and the approximation formula (8), we have . Then, applying (7) twice, we get
[TABLE]
where, by (8), we have . Thus,
[TABLE]
The assumptions and imply that
[TABLE]
which is bigger than a large absolute constant if is large enough, whence . Applying (9), we get
[TABLE]
It remains to reuse (10). ∎
Lemma 2.7 (Lower deviation for intermediate order statistics).
There is a universal constant with the following property. Let be large, let and . Then
[TABLE]
Proof.
As in the proof of the above lemma, we let be such that . Denoting by the cdf of , we have
[TABLE]
[TABLE]
and
[TABLE]
for a sufficiently small universal constant . Finally, in view of (9),
[TABLE]
∎
The two lemmas above need to be complemented with the following crude bound for probability of very large deviations.
Lemma 2.8.
Let and . Then
[TABLE]
Proof.
We have
[TABLE]
∎
3 Negative truncated moments of –norms
In this section, we derive upper bounds for expressions of the form
[TABLE]
where the numbers and are such that , and is a truncation level which can take any value in the range . In particular, for the above quantity is the -th moment of the –norm — . Negative moments of arbitrary norms were considered in [12], where, in particular, bounds for quantities of the form were derived for less than , the “lower Dvoretzky dimension” of a norm . In [21], negative -th moments of –norms were considered in the same context as our note; however, the relations derived in [21] (see, in particular, [21, Lemma 3.6]) do not extend to the case when both and are greater than . Finally, let us mention a recent work [19] where a strong upper bound on was obtained in terms of the positive moment and the variance for any norm in . On the other hand, applying this result of [19] would require extra care because of the absence of a truncation level in the statement of [19], and the necessity to have precise lower bounds for . The approach we take here is relatively elementary and based on the Chernoff inequality which we used in Section 2.2.
We start with the following small ball probability estimate:
Lemma 3.1.
Let be a large integer, be the standard Gaussian vector, and let and . Then for any number we have
[TABLE]
where are universal constants.
Proof.
Obviously,
[TABLE]
so that for any we have
[TABLE]
First, assume that , where the constant comes from Lemma 2.6. We will divide the above sum into two parts corresponding to large and “intermediate” order statistics. For every , using the notation , we get, in view of Lemmas 2.6 and 2.8,
[TABLE]
Further, for all we have, by Lemmas 2.7 and 2.8,
[TABLE]
Combining the estimates (note that the first terms in the first minimum form a geometric sum), we get
[TABLE]
Finally, observe that for , we have that is bounded from above by an absolute constant, so the last estimate is trivially satisfied as long as is chosen sufficiently large. ∎
As a consequence, we obtain
Proposition 3.2.
For any there are depending only on with the following property. Let , let and be such that , and let be i.i.d. standard Gaussians. Then for any we have
[TABLE]
Proof.
Fix admissible parameters . We will assume that is large. For any integer , we have
[TABLE]
Applying Lemma 3.1, we obtain for all :
[TABLE]
In the range , we have
[TABLE]
for a sufficiently small universal constant . In particular, for all such the probability in (11) is bounded from above by , where may only depend on . Further, for , we have
[TABLE]
Finally, for the probability in (11) is bounded by
[TABLE]
Combining the estimates, we get for and ,
[TABLE]
Hence,
[TABLE]
and the result follows. ∎
Remark 3.3.
Note that for any , we have
[TABLE]
where the symbol “” means that the quantities are equivalent up to a multiple depending only on parameter . To see this, observe that for any we have , whence
[TABLE]
It remains to apply (8) to compare with the power of the second quantile .
4 Upper bounds for the variance
In this section we obtain upper bounds for , . Before we proceed with the proofs, let us provide some motivation for the strategy we have taken. As we mentioned in the introduction, the basic tool for estimating the variance from above is Talagrand’s bound (Theorem 2.1). In [21], the theorem was directly applied to the norm , which gives the estimate
[TABLE]
where denotes the -th partial derivative of the norm (viewed as a function in ) evaluated at . An elementary computation then leads to an equivalent inequality
[TABLE]
where $B=1+\log\big(\sqrt{\mathbf{E}|\partial_{i}\|G\|_{p}|^{2}}/\mathbf{E}|\partial_{i}\|G\|_{p}|\big)$, and so can be at most logarithmic in . A natural approach to estimating the expectation in the last formula would be to remove from the denominator and use independence:
[TABLE]
However, this approach fails for all : the upper bound for the variance we get this way is worse than the bound that follows from –Lipschitzness of –norm. To see this, observe that
[TABLE]
whence, applying standard estimates for absolute moments of Gaussian variables, we get that the expression on the right hand side of (13) is at least of order .
In fact, as we show later, the estimate (13) is not sharp for all . Clearly, the problem with the above argument lies in the fact that, for large , the input of the individual coordinate to the total sum can be huge, and removing the term from the denominator in (12) alters the expectation.
As a way to resolve the issue, we will consider truncated Gaussian variables. Given and a truncation level , we introduce an auxiliary function
[TABLE]
so that
[TABLE]
and then treat the two terms on the r.h.s. separately (the parameter shall always be clear from the context). Determining the right truncation level (when both terms admit satisfactory upper estimates) is not straightforward. We prefer to postpone the actual definition of the truncation level, and consider first some general estimates when is arbitrary and .
We start with .
Lemma 4.1.
For any large integer , any and any truncation level we have
[TABLE]
Proof.
Define a random set . Since for any concave function in and any and , we have
[TABLE]
then, taking and , we get
[TABLE]
For every , let be the indicator of the event that exactly coordinates of are greater (in absolute value) than . It follows from the above that for every we have
[TABLE]
where, in view of (17),
[TABLE]
Hence,
[TABLE]
In view of (4), we have , and
[TABLE]
Summarizing, we get
[TABLE]
It is easy to show that for any number we have
[TABLE]
Since in our case , relation (7) implies that , and
[TABLE]
The result follows. ∎
As the next step, we consider the variance of .
Lemma 4.2.
Let be a large integer, let and let . Then
[TABLE]
where
[TABLE]
Proof.
It follows from Theorem 2.1 that
[TABLE]
where . First we estimate the numerator. By Proposition 3.2 applied with a constant parameter to the standard –dimensional truncated Gaussian vector, we have
[TABLE]
Next, observe that
[TABLE]
(this can be easily verified using relation (8)). Then, in view of Remark 3.3,
[TABLE]
It remains to estimate from below the denominator in (19). Essentially repeating the above computations, we get
[TABLE]
Further,
[TABLE]
and the statement follows. ∎
For shortness, in what follows we denote
[TABLE]
Note that and by (8)
[TABLE]
Let us state a combination of the last two lemmas as a corollary:
Corollary 4.3.
Let be a large integer, let , and let . Then
[TABLE]
where is defined by (18).
Essentially, our work consists in optimizing the above expression over admissible . It turns out that taking the truncation level close to
[TABLE]
produces optimal upper bounds for the variance. Observe that the quantity in (22) is greater than . Indeed,
[TABLE]
Thus, (22) may serve as an admissible truncation level in (21). The following estimates are implied by Corollary 2.4 and relation (7).
Lemma 4.4 (Estimates for ).
Let be a large integer and let . Then
- For , we have
[TABLE]
- For , we have
[TABLE]
While working with expression (22) directly may be complicated, the above lemma allows a somewhat simpler (equivalent) definition. For , we define a truncation level as follows
[TABLE]
In the next statement we collect some simple properties of .
Lemma 4.5.
Provided that is sufficiently large, we have:
- for all ;
- If then ;
- If then ;
- If then .
Proof.
First, taking into account that
[TABLE]
we get for all . In the range , the assertion trivially follows from (20) and the estimate . For , we have , and as , we get . In the interval the statement follows from the definition of . ∎
Lemma 4.6.
We have for all .
Proof.
It is enough to show that
[TABLE]
The derivative of the right hand side with respect to is
[TABLE]
which is less than zero if and only if . Thus, the minimum of on is attained at , and at the point the expression is equal to . ∎
Lemma 4.7 (Estimates for ).
Let be a large integer, , and let be defined as before. Then
- For , we have
[TABLE]
- For , we have
[TABLE]
Proof.
For the statement follows directly from the definition of . Suppose that . Then , and, applying (7), we get
[TABLE]
We will use the fact that for all . Note that
[TABLE]
Hence, the previous estimate implies
[TABLE]
where in the last relation we used the fact that for any . This proves the lemma. ∎
The last lemma obviously provides upper bounds for the first term in (21) (for ).
Lemma 4.8.
Let be a large integer and let . Then
- If then
[TABLE]
- If then
[TABLE]
- If then
[TABLE]
Proof.
First, consider the range . We have , whence, by Corollary 2.4,
[TABLE]
Next, assume that . Here, , and in view of Corollary 2.4,
[TABLE]
Hölder’s inequality then implies
[TABLE]
On the other hand
[TABLE]
whence
[TABLE]
Applying the definition of and the estimates for from Lemma 4.7, we get
[TABLE]
Finally, note that, in the given range for , we have
[TABLE]
Now, we consider the interval . In this range we have , so
[TABLE]
It remains to apply Lemma 4.7. ∎
Lemma 4.9.
Let be a large integer and let . Then, with defined by formula (18) with , we have .
Proof.
Since for all , we have in view of Lemma 4.4 and the definition of :
[TABLE]
Further, . This leads to
[TABLE]
If , then , and by Corollary 2.4, we have
[TABLE]
Hence,
[TABLE]
If , then , and by Corollary 2.4 and Lemma 4.8, we get
[TABLE]
In the range under consideration, the minimum of $\exp\big(-\frac{p}{2e}n^{2/p}\big)$ is attained at , whence , and the statement follows.
Finally, if , then and . Denote . By Hölder’s inequality and Corollary 2.4 we have
[TABLE]
On the other hand, Lemma 4.8 gives
[TABLE]
Together with Lemma 4.7 the estimates imply
[TABLE]
This completes the proof of the lemma. ∎
A combination of Lemmas 4.7, 4.8, and 4.9 with Corollary 4.3 gives
Proposition 4.10.
Let be a large integer and let . Then
- For we have
[TABLE]
- For we have
[TABLE]
- For we have
[TABLE]
Proof.
First, assume that . By Corollary 4.3, Lemmas 4.7, 4.8, and 4.9 and Corollary 2.4 we have
[TABLE]
It remains to apply Lemma 4.6 to the first term.
Next, we treat the case . Using the same argument as above and the fact that , we obtain
[TABLE]
Hence,
[TABLE]
Finally, we consider the range . We have, in view of Lemma 4.8, Corollary 2.4 and relation (7):
[TABLE]
Thus,
[TABLE]
∎
Note that in Proposition 4.10 we treat the case . In the regime , we will rely on the following result from [21]:
Lemma 4.11 ([21, Section 3]).
We have for all and .
5 Lower bounds for the variance
Let us start with a useful auxiliary result from [21]. We provide a proof for the reader’s convenience.
Lemma 5.1 ([21, Section 3]).
Let and . Then
[TABLE]
where and are independent standard Gaussian vectors in .
Proof.
First, clearly . Next, it can be checked, using elementary convexity properties, that for any two positive real numbers we have
[TABLE]
Applying the above inequality for and , we obtain
[TABLE]
It is easy to see that, for , the terms in the above sum are equal to zero, whence
[TABLE]
The result follows. ∎
As a simple corollary, we obtain the main technical element of the section:
Lemma 5.2.
There is a universal constant with the following property. Assume that and . Further, let and be any numbers such that
[TABLE]
where are i.i.d. standard Gaussians. Then for the standard Gaussian vector in we have
[TABLE]
Proof.
In view of Lemma 5.1, we have
[TABLE]
where is an independent copy of . By the assumptions on we have , whence
[TABLE]
Further, observe that for any two numbers and we have , whence, in particular,
[TABLE]
Together with the above inequalities, it gives
[TABLE]
where in the last step we used that, by Corollary 2.4,
[TABLE]
if is big enough. ∎
Naturally, we would like to apply the above lemma with close to where the truncation level was defined in Section 4. We have
Lemma 5.3.
Let be a large integer and let be the constant from Lemma 5.2. Then for any we have
[TABLE]
where is defined by formula (23).
Proof.
As it was observed back in Lemma 4.4, we have
[TABLE]
In particular, there is a universal constant such that
[TABLE]
By Markov’s inequality, given i.i.d. Gaussian variables , we have
[TABLE]
Further, we observe that
[TABLE]
whence
[TABLE]
Thus, and satisfy conditions of Lemma 5.2. Applying Lemma 5.2, we obtain
[TABLE]
∎
As a corollary of Lemma 5.3 and bounds on truncated moments from Lemma 4.8, we obtain
Proposition 5.4.
Let be a large integer and let , where is defined in Lemma 5.2. Then
- For we have
[TABLE]
- For we have
[TABLE]
- For we have
[TABLE]
Proof.
First, assume that . Then a combination of Lemmas 5.3 and 4.8 and the definition of gives
[TABLE]
Next, assume that . Again, combining Lemmas 5.3 and 4.8, we get
[TABLE]
Finally, consider the case . We have , and
[TABLE]
The statement follows. ∎
Note that in the regime the above estimate gives the right order for only if . For we will use the following estimate from [21]:
Lemma 5.5 ([21]).
There is a universal constant such that for and all we have
[TABLE]
Together, Proposition 4.10, Lemma 4.11, Proposition 5.4 and Lemma 5.5 imply Theorem A from the introduction in the regime . For , we refer to [21].
6 Proof of Corollary B
Let be given. It follows from Theorem A that there exists depending on such that for all sufficiently large , we have
[TABLE]
Let and . Construct an Gaussian matrix whose columns are jointly independent standard Gaussian vectors in . Then a uniform random -dimensional subspace can be defined as
[TABLE]
The subspace is –spherical in for if
[TABLE]
The last inequality holds whenever for all we have
[TABLE]
Let . Note that if then is a standard Gaussian vector and we have
[TABLE]
Let be an -net of minimal cardinality in -metric in . We have and
[TABLE]
Conditioning on the event that $\big|\|\mathcal{G}y\|_{p}-a\big|\leq\varepsilon^{\prime}a/4$ for all we have
[TABLE]
(see e.g. Lemma 3.2 of [20]). Thus if there exists such that $\big|\|\mathcal{G}x\|_{p}-a\big|>\varepsilon^{\prime}a/2$ then there exists such that and
[TABLE]
This leads to
[TABLE]
To pass from to , note that given and non-negative random variables , , the event that
[TABLE]
is contained inside the event
[TABLE]
By the standard concentration estimates, we have
[TABLE]
hence, taking , , and , we get
[TABLE]
provided that is big enough. Thus (1) is proved.
To prove the second part of Corollary B, it is enough to consider the case . Then , where and are two independent standard Gaussian vectors in , and
[TABLE]
Thus it is enough to show that for we have
[TABLE]
for some depending only on . Observe that standard concentration estimates imply
[TABLE]
whence it is enough to show that
[TABLE]
for some . Recall that for some . Hence,
[TABLE]
Next, observe that , and in view of -symmetry of the –norm and by a result from [29], we have
[TABLE]
This, together with the above relation, implies
[TABLE]
whence
[TABLE]
This, and the fact that with very large probability, implies the statement.
References
- [1] S. Artstein-Avidan, A. Giannopoulos and V. D. Milman, Asymptotic geometric analysis. Part I, Mathematical Surveys and Monographs, 202, American Mathematical Society, Providence, RI (2015). MR 3331351
- [2] S. Boucheron, G. Lugosi and P. Massart, Concentration inequalities, Oxford University Press, Oxford, 2013. MR 3185193
- [3] S. Boucheron and M. Thomas, Concentration inequalities for order statistics, Electron. Commun. Probab. 17 (2012), no. 51, 12 pp. MR 2994876
- [4] S. Chatterjee, Superconcentration and related topics, Springer Monographs in Mathematics (2013).
- [5] H. Chernoff, A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations, Ann. Math. Statistics 23 (1952), 493–507. MR 0057518
- [6] D. Cordero-Erausquin and M. Ledoux, Hypercontractive measures, Talagrand’s inequality, and influences, in Geometric aspects of functional analysis, 169–189, Lecture Notes in Math., 2050, Springer, Heidelberg. MR 2985132
- [7] H. A. David, Order statistics, second edition, Wiley, New York, 1981. MR 0597893
- [8] A. Dvoretzky, Some results on convex bodies and Banach spaces, in Proc. Internat. Sympos. Linear Spaces (Jerusalem, 1960), 123–160, Jerusalem Academic Press, Jerusalem. MR 0139079
