Estimates of norms of log-concave random matrices with dependent entries
Marta Strzelecka

TL;DR
This paper provides estimates for the expected operator norms of certain log-concave random matrices with dependent entries, extending previous results for Gaussian matrices and achieving near-optimal bounds.
Contribution
It generalizes existing bounds to matrices with dependent log-concave entries and introduces bounds for matrices with Gaussian mixture entries.
Findings
Expected norms are estimated up to logarithmic factors.
Results extend to matrices with dependent entries and Gaussian mixtures.
Bounds are shown to be near-optimal.
Institute of Mathematics, University of Warsaw, Banacha 2, 02–097 Warsaw, Poland.
(Date: February 4, 2019)
Abstract.
We prove estimates for $\mathbb{E}\|X\colon \ell_{p'}^n\to\ell_q^m\|$ for $p,q\ge 2$ and any random matrix $X$ having entries of the form $a_{ij}Y_{ij}$, where $Y=(Y_{ij})_{1\le i\le m,\,1\le j\le n}$ has i.i.d. isotropic log-concave rows. This generalises the result of Guédon, Hinrichs, Litvak, and Prochno for Gaussian matrices with independent entries. Our estimate is optimal up to logarithmic factors. As a byproduct we provide the analogous bound for $m\times n$ random matrices whose entries form an unconditional vector in $\mathbb{R}^{mn}$. We also prove bounds for norms of matrices whose entries are certain Gaussian mixtures.
Key words and phrases:
Random matrices, operator norm, log-concave vectors, unconditional vectors.
2010 Mathematics Subject Classification:
60B20, 46B09, 15B52
The research was supported by the National Science Centre, Poland via the grants 2015/19/N/ST1/02661 and 2018/28/T/ST1/00001.
1. Introduction and main result
A classical result regarding spectra of random matrices is Wigner’s Semicircle Law, which describes the limit of empirical spectral measures of a random matrix with independent centred entries with equal variance. Theorems of this type say nothing about the largest eigenvalue (i.e. the operator norm). However, Seginer proved in [17] that for a random matrix with i.i.d. symmetric entries (by we denote the operator norm of the matrix from to ) is of the same order as the expectation of the maximum Euclidean norm of rows and columns of . The same holds true for the structured Gaussian matrices (i.e. when and are i.i.d. standard Gaussian variables), as was recently shown by Latała, van Handel, and Youssef in [14], and up to a logarithmic factor for any with independent centred entries, see [16]. The advantage of the last two results is that they do not require that the entries of are equally distributed (nor that they have equal variances).
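The comparison between the operator norm and the largest Euclidean norm of a row or column can be explored numerically. The following is only a minimal sketch for the Euclidean (ℓ2 to ℓ2) case with i.i.d. standard Gaussian entries; the helper names `spectral_norm` and `max_row_col_norm`, the matrix size and the seed are illustrative assumptions, and the power iteration only approximates the norm from below.

```python
import math
import random


def spectral_norm(a, iters=200):
    """Approximate the l2 -> l2 operator norm of a (list of rows) by power iteration on a^T a."""
    m, n = len(a), len(a[0])
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        y = [sum(a[i][j] * x[j] for j in range(n)) for i in range(m)]  # y = a x
        z = [sum(a[i][j] * y[i] for i in range(m)) for j in range(n)]  # z = a^T y
        norm_z = math.sqrt(sum(v * v for v in z))
        if norm_z == 0.0:
            return 0.0
        x = [v / norm_z for v in z]
    y = [sum(a[i][j] * x[j] for j in range(n)) for i in range(m)]
    return math.sqrt(sum(v * v for v in y))


def max_row_col_norm(a):
    """Largest Euclidean norm among all rows and all columns of a."""
    rows = max(math.sqrt(sum(v * v for v in row)) for row in a)
    cols = max(math.sqrt(sum(row[j] ** 2 for row in a)) for j in range(len(a[0])))
    return max(rows, cols)


rng = random.Random(0)  # fixed seed, chosen arbitrarily
m, n = 30, 40           # illustrative dimensions
gauss = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
op = spectral_norm(gauss)
rc = max_row_col_norm(gauss)
print(f"operator norm ~ {op:.3f}, max row/column norm = {rc:.3f}, ratio = {op / rc:.3f}")
```

For every fixed matrix the operator norm is at least the largest row or column norm; the cited results concern how far the expectation of the former can exceed that of the latter.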
Another upper bound for also does not require equal distributions but only the independence of entries: by [9] we know that
[TABLE]
This bound is dimension free, but in some cases is worse than the one from [16].
Upper bounds for the expectation of other operator norms were investigated in [2] in the case of independent centred entries bounded by . For and matrices the authors proved that . In [6] Guédon, Hinrichs, Litvak, and Prochno proved that for a structured Gaussian matrix and ,
[TABLE]
This estimate is optimal up to logarithmic factors (see Remark 1.2 below). Note that in the case the moment method fails in estimating (as it gives information only on the spectrum of ).
All the mentioned results require the independence of entries of . In this article we will see how to generalise the main result of [6] to a wide class of random matrices with independent uncorrelated log-concave rows, following the scheme of proof of the original theorem from [6]. In order to obtain the key estimates for log-concave vectors needed in the proof we use the comparison of weak and strong moments of -norm of from [11] and a Sudakov minoration-type bound from [10].
Our estimate is optimal (for fixed ) up to a factor depending logarithmically on the dimension. Let us stress that we do not require the rows of to have independent, but only uncorrelated coordinates (and to be log-concave) — we require the independence only between the rows.
Before we state our main results, let us say a few words about log-concave vectors. We say that a random vector in is log-concave, if for any compact nonempty sets and ,
[TABLE]
The class of log-concave vectors is closed under linear transformations, convolutions and weak limits. By the result of Borell [3] an -dimensional vector with a full dimensional support is log-concave if and only if it has a log-concave density, i.e. has a density of the form , where is a convex function with values in .
Log-concave vectors are a natural generalisation of vectors distributed uniformly over convex bodies. Moreover, distribution of any log-concave vector can be obtained as a weak limit of projections of uniform measures over (higher dimensional) convex bodies (see for example [1]). Other results and conjectures about log-concave vectors are discussed in monograph [4].
We say that a vector in is isotropic if . If is a log-concave random vector in with full dimensional support, then there exists a linear transformation such that , so the isotropicity is only a matter of normalisation.
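As a numerical illustration of this normalisation, the sketch below assumes the standard convention that an isotropic vector is centred with identity covariance. It takes the uniform distribution on the unit cube (log-concave, since it is uniform on a convex body) and applies an affine map to normalise it. The function names, sample size and seed are illustrative choices.

```python
import math
import random


def normalise(u):
    """Map a point of [0, 1]^n to [-sqrt(3), sqrt(3)]^n: centre, then rescale to unit variance."""
    return [math.sqrt(12.0) * (v - 0.5) for v in u]


def empirical_mean_and_cov(samples):
    """Empirical mean vector and covariance matrix of a list of equal-length vectors."""
    k, n = len(samples), len(samples[0])
    mean = [sum(x[j] for x in samples) / k for j in range(n)]
    cov = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in samples) / k
            for j in range(n)] for i in range(n)]
    return mean, cov


rng = random.Random(1)  # fixed seed, chosen arbitrarily
samples = [normalise([rng.random() for _ in range(3)]) for _ in range(20000)]
mean, cov = empirical_mean_and_cov(samples)
print("mean:", [round(v, 3) for v in mean])
for row in cov:
    print([round(v, 3) for v in row])
```

The printed covariance should be close to the identity matrix, up to sampling error.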
To make the notation clearer, if is an matrix, we denote by its -th row and by we denote its -th column. We are now ready to present the main theorem.
Theorem 1.1.
Let , let be i.i.d. isotropic log-concave vectors in , and let be an (deterministic) matrix. Consider a random matrix with entries for , where is the -th coordinate of . Then for every we have
[TABLE]
where depends only on and .
Remark 1.2.
Note that the bound from Theorem 1.1 is optimal up to a constant depending on and logarithmically on the dimension. Indeed, since is log-concave we have by the regularity of (see (2.1) below) that $\mathbb{E}|Y_{ij}|\geq(2C_1)^{-1}\bigl(\mathbb{E}Y_{ij}^{2}\bigr)^{1/2}=(2C_1)^{-1}$, where $C_1$ is the constant from (2.1). Hence for every , (we take , use the unconditionality of and the Jensen inequality)
[TABLE]
Since , we also have for all . Moreover, for all and , (we take and )
[TABLE]
Therefore
[TABLE]
which yields the claim.
The next corollary is a version of Theorem 1.1 in the spirit of the aforementioned results from [17, 14, 16]. It follows directly from (1.3) and the Jensen inequality.
Corollary 1.3.
Under the assumptions of Theorem 1.1 we have
[TABLE]
Remark 1.4.
If the rows and columns of are isotropic and log-concave (we do not require independence), and , then
[TABLE]
which means that the bound we used in the proof of Corollary 1.3 (the one which uses the Jensen inequality) may be reversed (in the log-concave setting) up to a logarithmic factor and constants depending only on and . Therefore the estimates from Theorem 1.1 and Corollary 1.3 are equivalent up to a logarithmic factor. Inequality (1.2) follows directly from the following proposition.
Proposition 1.5.
Let be an random matrix, with isotropic and log-concave rows, let be a deterministic matrix, and let . Then
[TABLE]
It turns out that instead of assuming the log-concavity, we may assume the unconditionality, i.e. that an random matrix we consider, treated as an -dimensional vector, is unconditional (we no longer assume the independence of rows). Recall that we say that a random vector in is unconditional, if for every choice of signs the vectors and are equally distributed (or, equivalently, that and are equally distributed, where are i.i.d. symmetric Bernoulli variables, independent of ). The assertion of the next corollary is expressed in the spirit of Corollary 1.3, which is more natural in the non log-concave setting (without the assumption of log-concavity the assertions of Theorem 1.1 and Corollary 1.3 are no longer equivalent).
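The definition of unconditionality can be checked exactly for distributions with finitely many atoms. The sketch below is an illustration only: the helper `is_unconditional` and the three toy distributions are hypothetical examples, not objects from the paper.

```python
from fractions import Fraction
from itertools import product


def flip(x, signs):
    """Multiply the coordinates of x by the given signs."""
    return tuple(s * v for s, v in zip(signs, x))


def is_unconditional(dist):
    """dist maps points (tuples) to probabilities; check invariance under every sign pattern."""
    n = len(next(iter(dist)))
    for signs in product((1, -1), repeat=n):
        flipped = {}
        for x, p in dist.items():
            key = flip(x, signs)
            flipped[key] = flipped.get(key, 0) + p
        if flipped != dist:
            return False
    return True


# Uniform law on the points (+-1, +-2): independent symmetric coordinates.
uncond = {(a, b): Fraction(1, 4) for a in (1, -1) for b in (2, -2)}
# Uniform law on {(1, 0), (-1, 0), (0, 1), (0, -1)}: dependent coordinates, still unconditional.
cross = {(1, 0): Fraction(1, 4), (-1, 0): Fraction(1, 4),
         (0, 1): Fraction(1, 4), (0, -1): Fraction(1, 4)}
# Uniform law on {(1, 1), (-1, -1)}: symmetric as a whole, but the signs flip together.
not_uncond = {(1, 1): Fraction(1, 2), (-1, -1): Fraction(1, 2)}

print(is_unconditional(uncond), is_unconditional(cross), is_unconditional(not_uncond))
```

The second example shows that unconditionality does not force the coordinates to be independent, which is the point of dropping the independence assumption here.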
Corollary 1.6.
Assume that is a random matrix such that the -dimensional vector is unconditional. Then for every we have
[TABLE]
where depends only on and .
The rest of this note is organised as follows. Section 2 contains results from other articles, which will be used in the sequel. Section 3 contains generalisations of Lemmas 3.1 and 3.2 from [6] to the log-concave setting and the proof of Theorem 1.1. In Section 4 we will show how to deduce an analogue of Theorem 1.1 for Gaussian mixtures (see Corollary 4.2) and we will provide a proof of Proposition 1.5. Section 5 is devoted to the proof of Corollary 1.6.
Notation. By we denote universal constants. If a constant depends on a parameter , we express it as . The value of may differ at each occurrence. Whenever we want to fix the value of an absolute constant we use letters . We may always assume that . For two quantities we write if there exists a constant , such that , and , if and . For two numbers and we write instead of .
For a random variable by we denote the -th integral norm of , i.e. the quantity (in the case we also call this quantity the -th strong moment of associated with the norm ). For a vector (in particular for a random vector ) and , by we denote the -norm of , i.e. . For we shall also write instead of . It will be always clear from the context, what means for a random object , so the double meaning of will not lead to any misunderstanding. Recall that for an matrix by we denote its norm from to . For by we denote the Hölder conjugate of , i.e. .
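The vector norms and the Hölder conjugate used throughout can be made concrete in a few lines. This is a minimal sketch under the standard definitions (the conjugate exponent is the one whose reciprocal adds up to 1 with the reciprocal of the original exponent); the function names `lp_norm` and `holder_conjugate` and the test vectors are illustrative. The last line checks Hölder's inequality on one example.

```python
import math


def lp_norm(x, p):
    """l_p norm of a finite sequence; p = math.inf gives the max norm."""
    if p == math.inf:
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def holder_conjugate(p):
    """The exponent conjugate to p (reciprocals sum to 1), for 1 <= p <= infinity."""
    if p == 1:
        return math.inf
    if p == math.inf:
        return 1.0
    return p / (p - 1.0)


x = [3.0, -4.0, 1.0]
y = [0.5, 2.0, -1.0]
p = 3.0
p_conj = holder_conjugate(p)
inner = abs(sum(a * b for a, b in zip(x, y)))
# Hoelder's inequality: |<x, y>| <= ||x||_p * ||y||_{conjugate of p}
print(p_conj, inner, lp_norm(x, p) * lp_norm(y, p_conj))
```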
2. Preliminaries
We will frequently use the regularity of for log-concave vectors and seminorms , i.e.
[TABLE]
(see [4, Theorem 2.4.6]).
We will also need the comparison of weak and strong moments for -norms of log-concave vectors:
Theorem 2.1 ([11, Theorem 5]).
Let be a log-concave vector in , and let . Then
[TABLE]
where
[TABLE]
is the -th weak moment of associated with the -norm.
We will use the previous theorem also in the tail-bound version:
Corollary 2.2.
Assume is a log-concave vector in , and . Then
[TABLE]
For the reader’s convenience we give a proof of this corollary, which goes along the lines of the proof of Corollary 1.3 in [12].
Proof.
Define a random variable . By the Paley–Zygmund inequality and (2.1) we have for , and ,
[TABLE]
In order to show (2.2) we consider 3 cases.
Case 1. . Then by (2.3)
[TABLE]
and (2.2) obviously holds if .
Case 2. . Let us then define
[TABLE]
By (2.3) we have
[TABLE]
By (2.1), Theorem 2.1, and Chebyshev’s inequality we have
[TABLE]
for large enough. Thus (2.2) holds in this case.
Case 3. . Then and (2.2) holds for any . ∎
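The proof above relies on the Paley–Zygmund inequality. In its standard form it states that a nonnegative random variable Z with finite second moment satisfies P(Z ≥ θ·EZ) ≥ (1 − θ)²(EZ)²/E(Z²) for 0 ≤ θ ≤ 1. The sketch below verifies this exactly on a small discrete distribution; the distribution and the helper name are hypothetical, and the random variable to which the proof applies the inequality is not reproduced here.

```python
from fractions import Fraction


def paley_zygmund_check(values, probs, theta):
    """Return (P(Z >= theta * EZ), (1 - theta)^2 (EZ)^2 / E Z^2) for a nonnegative finite Z."""
    ez = sum(v * p for v, p in zip(values, probs))
    ez2 = sum(v * v * p for v, p in zip(values, probs))
    tail = sum(p for v, p in zip(values, probs) if v >= theta * ez)
    bound = (1 - theta) ** 2 * ez ** 2 / ez2
    return tail, bound


# Toy distribution: Z = 0, 1, 4 with probabilities 1/2, 1/4, 1/4 (exact rational arithmetic).
values = [Fraction(0), Fraction(1), Fraction(4)]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
for theta in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    tail, bound = paley_zygmund_check(values, probs, theta)
    print(theta, tail, bound, tail >= bound)
```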
In the proof of Theorem 1.1 we will use Theorem 2.1 from [6], which is another version of results provided before by Guédon–Rudelson in [8], and by Guédon–Mendelson–Pajor–Tomczak-Jaegermann in [7]:
Theorem 2.3 ([6, Theorem 2.1]).
Let be a Banach space with modulus of convexity of power type with constant . Let be independent random vectors, and let . Define
[TABLE]
and
[TABLE]
where is the Rademacher type constant of . Then
[TABLE]
We will use Theorem 2.3 with . In this case and are known.
3. Proof of Theorem 1.1
The next two lemmas provide estimates of the quantities and appearing in Theorem 2.3 in the case .
Lemma 3.1.
Assume , and are as in Theorem 1.1. Then
[TABLE]
where depends only on and .
Lemma 3.2.
Assume , and are as in Theorem 1.1. Then
[TABLE]
In the proof of Lemma 3.1 we will also need the following estimate:
Lemma 3.3.
Assume that is an isotropic log-concave vector in . Then for all and all we have
[TABLE]
where denotes the non-increasing rearrangement of .
In order to prove Theorem 1.1, we repeat the proof scheme from [6].
Proof of Theorem 1.1.
We use Theorem 2.3 for . Then (see [15, Theorem 5.3]) and . Let and be given by formulas (2.4) and (2.5). The triangle inequality, Theorem 2.3, Lemma 3.1, and Lemma 3.2 yield
[TABLE]
∎
The main contribution of this article lies in the proofs of Lemmas 3.1, 3.2, and 3.3.
Proof of Lemma 3.3.
We may and do assume that , i.e. for . By [10, Proposition 3.3] we have for all ,
[TABLE]
Thus
[TABLE]
Proof of Lemma 3.1.
We may and do assume that .
Since we may approximate by nonzero numbers, we may and do assume that for all . Let be the constants from (2.2), let be the constant from Lemma 3.3, and recall that is the constant from (2.1). We may assume that all these constants are greater than .
Note that for any we have . Thus, by the triangle inequality,
[TABLE]
Moreover, for every we have by (2.1) and the isotropicity of , that
[TABLE]
Now we pass to the estimation of the first term of (3.2). Let
[TABLE]
By (2.2) we have
[TABLE]
For the function we integrate vanishes, so from now on we will consider only ’s for which .
Note that if and , then
[TABLE]
and . Therefore
[TABLE]
Now we will estimate from below. For let
[TABLE]
Since are identically distributed, does not depend on . By (2.1), and the isotropicity of we have
[TABLE]
Since we can permute the rows of , we may and do assume that
[TABLE]
Let be an index such that . Lemma 3.3 applied to and the non-increasing sequence implies
[TABLE]
so for all we have
[TABLE]
Note that by (2.1) for all we have . Take . Then by a calculation similar to the one above we get
[TABLE]
so indeed .
Therefore for all we have
[TABLE]
Since the function is strictly increasing, the previous inequality yields . This together with (3.5) implies that (recall that )
[TABLE]
Inequalities (3), (3.8), and the Stirling formula yield that
[TABLE]
Moreover, by (2.1)
[TABLE]
where the second inequality holds since the weak first moment is bounded above by the strong first moment. This together with (3.2), (3), and (3.9) gives the assertion. ∎
Proof of Lemma 3.2.
Note that if , then for every we have , so we may and do assume . By (2.1), the isotropicity of , and the Jensen inequality we have
[TABLE]
Remark 3.4.
By the same reasoning as in the log-concave case, we may prove (using [12, Corollary 1.3], [13, Theorem 2.1], and the claim below instead of (2.2), Lemma 3.3 and the previous estimates on , respectively) the following.
Let be an random matrix with entries , where are independent symmetric random variables such that . Assume that for any and any , we have with . Then for every we have
[TABLE]
where depends only on , , and . At the end of Section 4 we provide another result concerning this type of random matrices (see Corollary 4.5).
As we mentioned, it suffices to prove the claim:
[TABLE]
where is an absolute constant, and repeat the proof of Theorem 1.1.
Proof of the claim.
It suffices to consider , where is an integer. Let us denote
[TABLE]
Let be the standard -dimensional Gaussian vector. Recall that for any and we have .
By the assumptions on and by the fact that we get
[TABLE]
which finishes the proof of (3.10). ∎
By the claim we get
[TABLE]
which allows us to obtain a version of (3) for
4. Estimates of norms of matrices in the case of Gaussian mixtures
Let us recall the definition from [5], where the significance of Gaussian mixtures is also described.
Definition 4.1.
A random variable is called a (centred) Gaussian mixture if there exists a positive random variable and a standard Gaussian random variable , independent of , such that has the same distribution as .
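To make the definition concrete, the sketch below samples a Gaussian mixture as a product R·G of a positive random variable R and an independent standard Gaussian G. The name R, the two-point law chosen for it, the sample size and the seed are assumptions made for illustration only.

```python
import random


def sample_gaussian_mixture(rng, scales):
    """Draw R uniformly from `scales` (positive numbers) and return R * G, G ~ N(0, 1) independent of R."""
    r = rng.choice(scales)
    return r * rng.gauss(0.0, 1.0)


def empirical_moment(xs, k):
    """Empirical k-th moment of the sample xs."""
    return sum(x ** k for x in xs) / len(xs)


rng = random.Random(2)  # fixed seed, chosen arbitrarily
scales = [1.0, 2.0]     # hypothetical law of R: 1 or 2 with probability 1/2 each
xs = [sample_gaussian_mixture(rng, scales) for _ in range(100000)]
m1 = empirical_moment(xs, 1)
m2 = empirical_moment(xs, 2)
m4 = empirical_moment(xs, 4)
print(f"mean ~ {m1:.3f}, second moment ~ {m2:.3f} (E R^2 = 2.5), kurtosis ~ {m4 / m2 ** 2:.3f}")
```

Since E(G²) = 1, the second moment of the mixture equals E(R²). A kurtosis above 3 reflects tails heavier than Gaussian whenever R is not constant.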
We will work with matrices of the form whose entries are Gaussian mixtures. We additionally assume that , where , and that the matrix is log-concave and isotropic (considered as a random vector in ). It will be clear from the proof that the corollary below is also true for another type of matrices: , where , and is an arbitrary isotropic log-concave random vector.
Corollary 4.2.
Let , let , let be a deterministic matrix, and let be a random matrix whose entries are i.i.d. standard Gaussian variables. Let , where is a log-concave and isotropic random matrix independent of . Then for every we have
[TABLE]
Proof.
Theorem 1.1 applied to and yields
[TABLE]
so it suffices to prove that
[TABLE]
and
[TABLE]
for . By the symmetry of assumptions we only need to show (4.1).
If , then
[TABLE]
and
[TABLE]
so it suffices to consider only (we used here the assumption that ).
Note that for any we have
[TABLE]
Fix . By Theorem 2.1 applied to , (recall that , so ), and we have
[TABLE]
Let us use (2.1) and the assumption to estimate the first term in (4):
[TABLE]
Recall that . We use again (2.1) and the isotropicity of to estimate the second term in (4):
[TABLE]
Take and put together (4), (4), and (4.4) to get the assertion. ∎
Remark 4.3.
Using [6, Theorem 1.1] instead of Theorem 1.1 in the proof above yields a slightly better estimate:
[TABLE]
Remark 4.4.
It is clear from the proof of Corollary 4.2 that in the case , where are i.i.d. standard Gaussian variables, inequality (4.1) may be slightly improved:
[TABLE]
In order to obtain this improvement one should use instead of . Therefore, if we additionally use Remark 4.3, the assertion of Corollary 4.2 in the case (where is independent of ) will state that
[TABLE]
Proof of Proposition 1.5.
We begin similarly as in the proof of (4.1) (in the case ), but we estimate the second term on the right-hand side of (4) in a slightly different way, using (2.1):
[TABLE]
We take to get the assertion. ∎
We may use the result concerning Gaussian mixtures to obtain an estimate similar to the one from Remark 3.4, valid for all (not only for ), but with slightly worse constants than in Remark 3.4. The proof is based on the fact that variables satisfying the moment assumption from Remark 3.4 are comparable with certain Gaussian mixtures.
Corollary 4.5.
Let , , and let be an random matrix with entries , where are independent symmetric random variables such that . Assume that for any and any , we have . Then for all ,
[TABLE]
Proof.
Let , , , be i.i.d. standard Gaussian variables. Let be i.i.d. symmetric Bernoulli random variables, independent of and . Note that satisfies for all , with a universal constant , since for . Let be the random matrix with entries . By [14, Lemma 4.7] we know that
[TABLE]
for any norm on real matrices. In particular
[TABLE]
Moreover, by the Jensen inequality and by (4.7) applied with we have
[TABLE]
which yields the assertion, since . ∎
5. The case of unconditional entries
Proof of Corollary 1.6.
Since is unconditional, it has the same distribution as the matrix , where are i.i.d. symmetric Bernoulli variables independent of . Let be i.i.d. standard Gaussian variables independent of and . Then
[TABLE]
where in the last step we used Corollary 1.3 to estimate the mean with respect to . We use (4.6) with (to in each term above separately) to get the assertion. ∎
Remark 5.1.
Using [6, Theorem 1.1] instead of Theorem 1.1 in the proof above yields a slightly better estimate in Corollary 1.6:
[TABLE]
6. Acknowledgements
I would like to thank Rafał Latała for suggestions which helped me to make the presentation clearer and more reader-friendly.
- [1] S. Artstein-Avidan, B. Klartag, and V. Milman, The Santaló point of a function, and a functional form of the Santaló inequality, Mathematika 51 (2004), no. 1-2, 33–48 (2005). MR 2220210
- [2] G. Bennett, V. Goodman, and C. M. Newman, Norms of random matrices, Pacific J. Math. 59 (1975), no. 2, 359–365. MR 0393085
- [3] C. Borell, Convex measures on locally convex spaces, Ark. Mat. 12 (1974), 239–252. MR 0388475
- [4] S. Brazitikos, A. Giannopoulos, P. Valettas, and B.H. Vritsiou, Geometry of isotropic convex bodies, Mathematical Surveys and Monographs, vol. 196, American Mathematical Society, Providence, RI, 2014. MR 3185453
- [5] A. Eskenazis, P. Nayar, and T. Tkocz, Gaussian mixtures: entropy and geometric inequalities, Ann. Probab. 46 (2018), no. 5, 2908–2945. MR 3846841
- [6] O. Guédon, A. Hinrichs, A.E. Litvak, and J. Prochno, On the expectation of operator norms of random matrices, Geometric aspects of functional analysis, Lecture Notes in Math., vol. 2169, Springer, Cham, 2017, pp. 151–162. MR 3645120
- [7] O. Guédon, S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann, Majorizing measures and proportional subsets of bounded orthonormal systems, Rev. Mat. Iberoam. 24 (2008), no. 3, 1075–1095. MR 2490210
- [8] O. Guédon and M. Rudelson, L_p-moments of random vectors via majorizing measures, Adv. Math. 208 (2007), no. 2, 798–823. MR 2304336
