Optimized Tail Bounds for Random Matrix Series

Xianjie Gao; Mingliang Zhang; Jinming Luo

PMC · DOI:10.3390/e26080633·July 26, 2024

Open Access

TL;DR

This paper improves tail bounds for random matrix series by using intrinsic dimension, making them more applicable in high-dimensional settings.

Contribution

The novelty lies in using intrinsic dimension instead of ambient dimension for tail bounds in random matrix series.

Findings

01

Modified tail bounds for matrix Gaussian and sub-Gaussian series are derived using intrinsic dimension.

02

Expectation bounds for random matrix series are obtained based on intrinsic dimension.

03

The approach is suitable for high-dimensional or infinite-dimensional settings.

Abstract

Random matrix series are a significant component of random matrix theory, offering rich theoretical content and broad application prospects. In this paper, we propose modified versions of tail bounds for random matrix series, including matrix Gaussian (or Rademacher), sub-Gaussian, and infinitely divisible (i.d.) series. Unlike existing studies, our results depend on the intrinsic dimension instead of the ambient dimension. In some cases, the intrinsic dimension is much smaller than the ambient dimension, which makes the modified versions suitable for high-dimensional or infinite-dimensional settings. In addition, we obtain expectation bounds for random matrix series based on the intrinsic dimension.

Funding

—National Natural Science Foundation of China
—Shanxi Provincial Research Foundation for Basic Research, China

Keywords

tail bound; intrinsic dimension; random matrix series; expectation bound

Taxonomy

Topics: Random Matrices and Applications · Advanced Combinatorial Mathematics · Advanced Algebra and Geometry

Full text

1. Introduction

Random matrix theory is a significant branch of mathematics, which delves into the properties and behavior of random matrices. Its applications span across various fields, including wireless communications [1], combinatorial optimization [2], matrix low-rank approximation [3], neural networks [4,5], and deep learning [6]. Random matrices have a wide range of applications in physics, entropy, and information science. They can provide comprehensive descriptions and analyses when dealing with multiple interacting elements, high-dimensional systems, and complex statistical relationships. Random matrices can better capture the complex interactions between multiple particles or multiple physical processes. When dealing with high-dimensional physical systems with a large number of degrees of freedom, random matrices provide a more natural and effective representation [7,8]. Random matrices can be used to calculate the entropy of complex systems to measure the degree of chaos and uncertainty of the system [9,10]. In the field of information science, random matrices can be used for performance optimization and signal processing of communication systems [11]. Random matrix theory provides a powerful theoretical basis for dealing with problems in these fields. Among them, the random matrix series is an important research topic in the field of random matrix theory, and has wide application and research value.

The study of random matrix theory comprises two branches: asymptotic theory and non-asymptotic theory. There have been several notable asymptotic results in random matrix theory, including Wigner’s semicircle law [12], the Marchenko–Pastur law [13], and the Bai–Yin law [14]. While these asymptotic statements can offer precise limiting results as the matrix dimension approaches infinity, they do not specify the rate at which these probability terms converge to their limits. In response to this challenge, non-asymptotic approaches to analyzing these probability terms have emerged.

Ahlswede and Winter [15] showed how the Golden–Thompson inequality [16,17] extends the Laplace transform method to the matrix setting, yielding tail bounds for sums of random matrices. Tropp [18] used a corollary of Lieb’s theorem [19] to improve significantly on the Ahlswede–Winter result. These bounds depend on the ambient dimension of the matrix and become excessively loose for high-dimensional matrices; to address this limitation, Hsu et al. [20] presented a tighter analog of the matrix Bernstein inequality. Minsker [21] extended Bernstein-type concentration inequalities for random matrices, sharpening the results in [20] by introducing the concept of effective rank. Zhang et al. [22] introduced dimension-free tail bounds for the largest singular value of sums of random matrices.

The matrix series form $[eqn]$ has played a crucial role in recent studies [23,24,25], where $[eqn]$ represents a random variable and $[eqn]$ is a fixed matrix. The variable $[eqn]$ can encompass various types of random variables, including Gaussian, Bernoulli, infinitely divisible random variables, and more. Tropp [18] utilized Gaussian series to study the key characteristics of matrix tail bounds. Zhang et al. [26] studied the tail inequalities of the largest eigenvalue of a matrix infinitely divisible (i.d.) series and applied them to optimization problems and compressed sensing.

1.1. Related Works

Consider the sum $[eqn]$ , where $[eqn]$ are real numbers and $[eqn]$ are independent standard Gaussian variables. There is the probability inequality

[eqn]
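For orientation, the scalar bound follows from the Laplace transform (Chernoff) method that the matrix results generalize. The derivation below is a sketch assuming that (1) is the standard Gaussian tail bound with variance parameter $\sigma^2 = \sum_k a_k^2$; the symbols $\theta$ and $\sigma^2$ are notation introduced here for illustration.

```latex
\begin{align*}
\mathbb{P}\Big\{\sum_k \gamma_k a_k \ge t\Big\}
  &\le e^{-\theta t}\,\mathbb{E}\exp\Big(\theta \sum_k \gamma_k a_k\Big)
   = e^{-\theta t}\prod_k e^{\theta^2 a_k^2/2}
   = \exp\Big(-\theta t + \frac{\theta^2\sigma^2}{2}\Big)
   \qquad (\theta > 0), \\
\theta = \frac{t}{\sigma^2} \;&\Longrightarrow\;
\mathbb{P}\Big\{\sum_k \gamma_k a_k \ge t\Big\}
   \le \exp\Big(-\frac{t^2}{2\sigma^2}\Big),
\qquad \sigma^2 := \sum_k a_k^2 .
\end{align*}
```

The first step is Markov's inequality applied to $e^{\theta S}$, the second uses independence and the Gaussian moment generating function, and the last optimizes over $\theta$.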

Let $[eqn]$ be a finite sequence of fixed Hermitian matrices with dimension d. Tropp [18] gave the following result for any $[eqn]$ :

[eqn]

A significant distinction between (1) and (2) is the presence of the matrix dimension factor d in the latter. Hsu et al. [20] obtained the following tail bound:

[eqn]

We observe that the right side of the inequality is the product of two terms. When both terms are smaller, the result will be tighter. Compared with (2), we know that $[eqn]$ but $[eqn]$ for $[eqn]$ . That is, one term of the results (2) and (3) becomes smaller, while the other term becomes larger. In other words, both outcomes have their respective limitations.

Let $[eqn]$ be a finite sequence of independent sub-Gaussian random variables. The tail bound can be obtained from

[eqn]

where c is an absolute constant.

Let $[eqn]$ be a finite sequence of independent infinitely divisible random variables. Let $[eqn]$ be fixed d-dimensional Hermitian matrices with $[eqn]$ , $[eqn]$ . For any $[eqn]$ , Zhang et al. [26] deduced the following results:

[eqn]

where $[eqn]$ and $[eqn]$ is the inverse of $[eqn]$ . For any $[eqn]$ ,

[eqn]

where $[eqn]$ .

In addition, Tropp [27] gave the expectation bound for the matrix Gaussian series,

[eqn]

and Zhang et al. [26] also proposed an expectation bound for infinitely divisible matrix series under some given conditions.

However, a significant drawback of the above results is their reliance on the ambient dimension of the matrix. The bounds tend to be very loose when the matrices have a high dimension. To solve this problem, we optimize the existing theory. Tighter tail bounds for random matrices give more precise and reliable probability estimates, which allow a more accurate description of the behavior of random matrices and help to improve the accuracy, efficiency, and reliability of both theory and applications.

1.2. Overview of Main Results

To overcome the limitations of the existing results and to complement and refine random matrix theory, we put forward optimized tail and expectation bounds for random matrix series in this paper, including matrix Gaussian (or Rademacher), sub-Gaussian, and infinitely divisible (i.d.) series. This makes the modified versions potentially adaptable to high-dimensional or infinite-dimensional matrix settings. Taking the matrix Gaussian series as an example, we obtain the tighter conclusion:

[eqn]

and

[eqn]

The $[eqn]$ and $[eqn]$ will be introduced in detail later in the paper.

The rest of this paper is organized as follows. Section 2 introduces some preliminary knowledge on the intrinsic dimension and on Gaussian (or Rademacher), sub-Gaussian, and infinitely divisible (i.d.) distributions. Section 3 gives tail and expectation bounds based on the intrinsic dimension for Gaussian (or Rademacher), sub-Gaussian, and infinitely divisible (i.d.) matrix series. The last section concludes the paper.

2. Notations and Preliminaries

In this section, some preliminary knowledge will be provided about the intrinsic dimension of the matrix, and also about Gaussian (or Rademacher), sub-Gaussian, infinitely divisible distributions, and matrix series.

2.1. The Intrinsic Dimension

Existing tail bounds on random matrix series depend on the ambient dimension of the matrix. We introduce the concept of the intrinsic dimension, which is much smaller than the ambient dimension in some cases (see also [27]).

Definition 1. For a positive-semidefinite matrix $[eqn]$ , the intrinsic dimension is defined as

[eqn]

It can be seen from the definition that the intrinsic dimension is not significantly affected by changes in the size of the matrix. In fact, when the eigenvalues of $[eqn]$ decay rapidly, the intrinsic dimension is much smaller than the ambient dimension.
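The contrast between the two dimensions can be illustrated numerically. The sketch below assumes that Definition 1 is the usual ratio of trace to spectral norm, tr(A)/‖A‖, as in [27]; the function name `intrinsic_dimension` and the two test matrices are hypothetical choices made for illustration.

```python
import numpy as np

def intrinsic_dimension(A: np.ndarray) -> float:
    """Return tr(A) / ||A|| for a nonzero positive-semidefinite matrix A (assumed definition)."""
    eigvals = np.linalg.eigvalsh(A)  # real eigenvalues in ascending order
    if eigvals[0] < -1e-10 * max(1.0, abs(eigvals[-1])):
        raise ValueError("A must be positive semidefinite")
    if eigvals[-1] <= 0:
        raise ValueError("A must be nonzero")
    return float(np.trace(A).real / eigvals[-1])

d = 1000  # ambient dimension

# Flat spectrum: the intrinsic dimension equals the ambient dimension.
identity_dim = intrinsic_dimension(np.eye(d))

# Geometrically decaying spectrum 1, 1/2, 1/4, ...: the intrinsic dimension stays below 2.
decaying_dim = intrinsic_dimension(np.diag(2.0 ** -np.arange(d)))

print(identity_dim, decaying_dim)
```

With the decaying spectrum, enlarging `d` leaves the ratio essentially unchanged, which is the sense in which the quantity is insensitive to the size of the matrix.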

2.2. Several Distributions

In this section, we briefly introduce three random distributions and their moment generating functions, including Gaussian (or Rademacher), sub-Gaussian, and infinitely divisible (i.d.) distributions.

The Gaussian distribution is a very important continuous distribution in probability theory and statistics, and is often used to represent real-valued random variables with unknown distribution. Given a Gaussian variable $[eqn]$ , the moment generating function (mgf) is given by

[eqn]

The Rademacher distribution is a discrete probability distribution in which the random variable takes on the value of 1 or $[eqn]$ with probability $[eqn]$ . Given a Rademacher variable $[eqn]$ , the moment generating function is given by

[eqn]
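A Rademacher variable has the closed-form moment generating function cosh(θ), which is dominated pointwise by the standard Gaussian moment generating function exp(θ²/2); this is why the two series can be treated together. The following minimal check of that inequality on a grid is illustrative only, and the grid range is an arbitrary choice.

```python
import numpy as np

theta = np.linspace(-5.0, 5.0, 1001)

rademacher_mgf = np.cosh(theta)        # E exp(theta * eps) = (e^theta + e^-theta) / 2
gaussian_mgf = np.exp(theta**2 / 2)    # E exp(theta * gamma) for gamma ~ N(0, 1)

# The gap should be nonnegative everywhere and vanish only at theta = 0.
gap = gaussian_mgf - rademacher_mgf
print(float(gap.min()))
```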

Sub-Gaussian distributions have strong tail decay; the class includes many distributions, such as the uniform distribution and all bounded random variables. Given a centered sub-Gaussian random variable $[eqn]$, it holds that

[eqn]

where c is an absolute constant.

Infinitely divisible (i.d.) distributions form a large class of probability distributions that play an important role in probability theory, in particular in limit theorems. A random variable $[eqn]$ has an i.d. distribution if, for any $[eqn]$, there exist independent and identically distributed (i.i.d.) random variables $[eqn]$ such that $[eqn]$ has the same distribution as $[eqn]$.

In discrete distributions, infinitely divisible distributions include Poisson distribution, negative binomial distribution, and geometric distribution. Among the continuous distributions, Cauchy distribution, Lévy distribution, stable distribution and Gamma distribution are examples of infinitely divisible distributions.
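As a concrete instance of the definition, a Poisson variable with rate λ has the same distribution as a sum of n i.i.d. Poisson variables with rate λ/n. The sketch below checks this by convolving probability mass functions; the helper `poisson_pmf` and the values of `rate`, `n`, and `size` are hypothetical.

```python
import math
import numpy as np

def poisson_pmf(rate: float, size: int) -> np.ndarray:
    """P{X = k} for k = 0, ..., size - 1 with X ~ Poisson(rate)."""
    log_pmf = [-rate + k * math.log(rate) - math.lgamma(k + 1) for k in range(size)]
    return np.exp(np.array(log_pmf))

rate, n, size = 3.0, 4, 40
piece = poisson_pmf(rate / n, size)

# The law of a sum of n i.i.d. Poisson(rate / n) variables is the n-fold convolution.
total = np.array([1.0])
for _ in range(n):
    # Entries with index < size depend only on inputs with index < size,
    # so truncating after each step does not change them.
    total = np.convolve(total, piece)[:size]

max_error = float(np.max(np.abs(total - poisson_pmf(rate, size))))
print(max_error)
```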

A real-valued random variable $[eqn]$ is i.d. if and only if there exists a triplet $[eqn]$ such that the characteristic function of $[eqn]$ is given by

[eqn]

where $[eqn]$, $[eqn]$, and $[eqn]$ is a Lévy measure. This necessary and sufficient condition is the Lévy–Khintchine theorem.

Let $[eqn]$ be an i.d. random variable with the triplet $[eqn]$ , and suppose that $[eqn]$ . Let $[eqn]$ . For any $[eqn]$ and $[eqn]$ ,

[eqn]

The proof can be found in [26].

2.3. Random Matrix Series

Given n fixed matrices $[eqn]$, a random matrix series is represented as $[eqn]$, where $[eqn]$ are independent random variables. The tail and expectation bounds for random matrix series control the quantities $[eqn]$ and $[eqn]$.
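The two quantities can be estimated by simulation. The sketch below draws a matrix Gaussian series with hypothetical random symmetric coefficients and compares Monte Carlo estimates of the tail probability and the expectation of the largest eigenvalue with the ambient-dimension bounds of Tropp [18,27], taken here in their standard forms d·exp(−t²/(2σ²)) and √(2σ² log d) with σ² = ‖Σ A_k²‖. The sizes, seed, and threshold are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, trials = 6, 12, 20000

# Hypothetical fixed symmetric coefficient matrices A_1, ..., A_n.
G = rng.standard_normal((n, d, d))
A = (G + G.transpose(0, 2, 1)) / 2

# Variance parameter sigma^2 = || sum_k A_k^2 ||.
V = np.einsum("kij,kjl->il", A, A)
sigma2 = float(np.linalg.eigvalsh(V)[-1])

# Draw Y = sum_k gamma_k A_k for many independent standard Gaussian vectors gamma.
gamma = rng.standard_normal((trials, n))
Y = np.einsum("tk,kij->tij", gamma, A)
lam_max = np.linalg.eigvalsh(Y)[:, -1]   # largest eigenvalue of each draw

t = 3.0 * np.sqrt(sigma2)
empirical_tail = float(np.mean(lam_max >= t))
ambient_tail_bound = float(d * np.exp(-t**2 / (2 * sigma2)))

empirical_mean = float(lam_max.mean())
ambient_mean_bound = float(np.sqrt(2 * sigma2 * np.log(d)))

print(empirical_tail, ambient_tail_bound)
print(empirical_mean, ambient_mean_bound)
```

The gap between the estimates and the bounds gives a rough sense of how much the dimensional prefactor costs for a given set of coefficients.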

3. Intrinsic Dimension Bounds for Matrix Series

In this section, we present tail bounds for random matrix series based on intrinsic dimension bounds, and also obtain the expectation bounds.

3.1. Matrix Gaussian (or Rademacher) Series with Intrinsic Dimension

This section presents the tail and expectation bounds for matrix Gaussian (or Rademacher) series with an intrinsic dimension.

Theorem 1. Consider a finite sequence $[eqn]$ of fixed Hermitian matrices with the same dimension d, and let $[eqn]$ be a finite sequence of independent Gaussian (or Rademacher) variables. Introduce the matrix $[eqn]$. Define the following parameters:

[eqn]

Then, it holds that

[eqn]

Compared with the previous results in (2) and (3), our result in (15) improves upon their respective shortcomings and is tighter. Therefore, our bound is more applicable to the case of high-dimensional matrices.
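To see when an intrinsic-dimension prefactor helps, consider coefficients that act only on a small block of a large ambient space. The sketch below takes the variance matrix to be V = Σ A_k² with σ² = ‖V‖; this choice of V is an assumption made for illustration, and the block construction and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
d, n, r = 200, 20, 3   # ambient dimension, number of terms, active block size

# Hypothetical coefficients: each A_k is symmetric and supported on an r x r block.
A = np.zeros((n, d, d))
for k in range(n):
    G = rng.standard_normal((r, r))
    A[k, :r, :r] = (G + G.T) / 2

V = np.einsum("kij,kjl->il", A, A)          # assumed variance matrix sum_k A_k^2
sigma2 = float(np.linalg.eigvalsh(V)[-1])   # sigma^2 = ||V||
intdim_V = float(np.trace(V) / sigma2)      # tr(V) / ||V||

print(d, intdim_V)
```

Here the intrinsic dimension of V is at most r regardless of d, so a prefactor based on it does not grow when the same series is embedded in a larger ambient space.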

Theorem 2. Given a matrix Gaussian (or Rademacher) series $[eqn]$ , then it holds that

[eqn]

Compared with the previous result in (7), our result in (16) depends on the intrinsic dimensions of the matrix, and is more applicable for the case of high-dimensional matrices.

The proofs of Theorems 1 and 2 are similar to the proofs of sub-Gaussian matrix series; we omit them here.

3.2. Matrix Sub-Gaussian Series with Intrinsic Dimension

This section presents the tail and expectation bounds for matrix sub-Gaussian series with an intrinsic dimension.

Theorem 3. Consider a finite sequence $[eqn]$ of fixed Hermitian matrices with the same dimension d, and let $[eqn]$ be a finite sequence of independent centered sub-Gaussian variables. Introduce the matrix $[eqn]$. Define the following parameters:

[eqn]

Then, it holds that

[eqn]

where c is an absolute constant.

Before proving this theorem, we first introduce a proposition [27] that will be used in the proof process. This proposition is a key step in our proof.

Proposition 1. Let $[eqn]$ be a random Hermitian matrix. Let $[eqn]$ be a nonnegative function that is nondecreasing on $[eqn]$ . For each $[eqn]$ ,

[eqn]

Proof. Let the sum

[eqn]

Fix a number $[eqn]$ , and define the function $[eqn]$ for $[eqn]$ . For $[eqn]$ , Proposition 1 states that

[eqn]

Introduce the matrix $[eqn]$ . According to the mgf of a sub-Gaussian random variable in (12) and the transfer rule (consider a real-valued function f, if $[eqn]$ for $[eqn]$ , then $[eqn]$ , when the eigenvalues of $[eqn]$ lie in I), it can be known that

[eqn]

Introduce the function $[eqn]$ , and observe that

[eqn]

Define the following parameters:

[eqn]

We have

[eqn]

Next, combine the bound (21) and the probability bound to obtain

[eqn]

We use the following formula to control the fraction:

[eqn]

We select $[eqn]$ to obtain

[eqn]

Invoke the assumption that $[eqn]$ to obtain the conclusion. □

Since the large deviation inequality considers the case where t is large, the limitation $[eqn]$ is reasonable.
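The transfer rule used in the proof can be checked numerically: a scalar inequality f(a) ≤ g(a) on an interval lifts to the semidefinite ordering f(A) ⪯ g(A) for Hermitian A with spectrum in that interval. The sketch below uses the pair f = cosh and g(a) = exp(a²/2) on a hypothetical random symmetric matrix; `apply_to_spectrum` is an illustrative helper, not a function from the paper.

```python
import numpy as np

def apply_to_spectrum(f, A: np.ndarray) -> np.ndarray:
    """Standard matrix function f(A) = Q diag(f(lambda_i)) Q^T for real symmetric A."""
    eigvals, Q = np.linalg.eigh(A)
    return (Q * f(eigvals)) @ Q.T

rng = np.random.default_rng(1)
G = rng.standard_normal((5, 5))
A = (G + G.T) / 2

f_of_A = apply_to_spectrum(np.cosh, A)
g_of_A = apply_to_spectrum(lambda a: np.exp(a**2 / 2), A)

# g(A) - f(A) is positive semidefinite exactly when its smallest eigenvalue is >= 0.
min_gap_eig = float(np.linalg.eigvalsh(g_of_A - f_of_A)[0])
print(min_gap_eig)
```

Because f(A) and g(A) share the eigenvectors of A, the difference has eigenvalues g(λ_i) − f(λ_i), which is why the scalar inequality suffices.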

Theorem 4. Given a matrix sub-Gaussian series $[eqn]$ , then it holds that

[eqn]

Proof. Fix a number $[eqn]$ .

[eqn]

Select $[eqn]$ ,

[eqn]

□

3.3. Matrix Infinitely Divisible Series with Intrinsic Dimension

This section presents the tail and expectation bounds for matrix i.d. series with an intrinsic dimension.

Theorem 5. Consider a finite sequence $[eqn]$ of fixed Hermitian matrices with the same dimension d, $[eqn]$, and let $[eqn]$ be a finite sequence of independent centered i.d. random variables with the triplet $[eqn]$, such that $[eqn]$ for some $[eqn]$. Introduce the matrix $[eqn]$. Define the following parameters:

[eqn]

Then, it holds that

[eqn]

where $[eqn]$ is the left limit at M, with

[eqn]

and $[eqn]$ is the inverse of

[eqn]

For any $[eqn]$ , we have

[eqn]

where

[eqn]

Compared with the previous result in [26], our results depend on the intrinsic dimensions of the matrix and are more applicable for the case of high-dimensional matrices.

Proof. Let the sum

[eqn]

Introduce the matrix $[eqn]$ . Similar to the above proof, according to the mgf of i.d. random variable in (14) and the transfer rule, we can obtain

[eqn]

Next, we minimize the right-hand side of (29) with respect to $[eqn]$ . Since $[eqn]$ for all $[eqn]$ , $[eqn]$ is infinitely differentiable on $[eqn]$ , with

[eqn]

and

[eqn]

Since $[eqn]$ , we have

[eqn]

We select $[eqn]$ to obtain

[eqn]

Invoking the assumption that $[eqn]$, we have

[eqn]

Actually, when $[eqn]$ , according to the convexity of $[eqn]$ with respect to $[eqn]$ and the monotonicity of $[eqn]$ ( $[eqn]$ ), the solution to the optimization problem is $[eqn]$ . Thus, for any $[eqn]$ , we have

[eqn]

□

Given some specific settings of the measure $[eqn]$ , we can obtain the following corollary.

Corollary 1. Assume ν has a bounded support, i.e., there exists a positive constant $[eqn]$ such that $[eqn]$ and $[eqn]$ . Let

[eqn]

It follows that $[eqn]$ . Then, for any $[eqn]$ ,

[eqn]

where $[eqn]$ , and

[eqn]

Proof. Since the support is $[eqn]$ , it holds that $[eqn]$ for any $[eqn]$ . Thus, we have

[eqn]

Denote $[eqn]$ with the inverse function $[eqn]$ $[eqn]$ . Since $[eqn]$ and $[eqn]$ ( $[eqn]$ ) are strictly increasing functions, their inverse functions satisfy the relation $[eqn]$ for all $[eqn]$ . By combining (27) and (36), we obtain, for any $[eqn]$ ,

[eqn]

where $[eqn]$ . This completes the proof. □

Given a matrix i.d. series $[eqn]$ , then it holds that

[eqn]

In other words, in the case where the tail bound is integrable, we can use Formula (37) to obtain the expectation bound based on the intrinsic dimensions for matrix i.d. series.

Compared with existing studies, our results are based on the intrinsic dimension of the matrix. The tail and expectation bounds are tighter than the previous results. Therefore, our bounds are more applicable for the case of high-dimensional matrices.

In addition, by using the Hermitian dilation, our results can also be extended to non-Hermitian random matrix series. For a general random matrix series $[eqn]$, the relation $[eqn]$ holds, where

[eqn]

Thus, we may invoke each theorem to obtain tail and expectation bounds for the norm of the random matrix series.
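The key property of the Hermitian dilation is that its largest eigenvalue equals the spectral norm of the original matrix, which is what allows eigenvalue bounds to control norms. The sketch below uses the standard block form H(B) = [[0, B], [B*, 0]] and a hypothetical rectangular complex matrix; the function name is illustrative.

```python
import numpy as np

def hermitian_dilation(B: np.ndarray) -> np.ndarray:
    """Return H(B) = [[0, B], [B^*, 0]], a Hermitian matrix of size d1 + d2."""
    d1, d2 = B.shape
    top = np.hstack([np.zeros((d1, d1), dtype=B.dtype), B])
    bottom = np.hstack([B.conj().T, np.zeros((d2, d2), dtype=B.dtype)])
    return np.vstack([top, bottom])

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 5)) + 1j * rng.standard_normal((3, 5))
H = hermitian_dilation(B)

lam_max = float(np.linalg.eigvalsh(H)[-1])      # largest eigenvalue of the dilation
spectral_norm = float(np.linalg.norm(B, 2))     # largest singular value of B
print(lam_max, spectral_norm)
```

Since the dilation is linear in B, the dilation of a series Σ ξ_k B_k is again a Hermitian series with coefficients H(B_k), so the Hermitian theorems apply to it directly.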

4. Conclusions

In this paper, we propose optimized tail and expectation bounds for random matrix series, including matrix Gaussian (or Rademacher) and sub-Gaussian and infinitely divisible (i.d.) series. Different from existing studies, our results depend on intrinsic dimension rather than ambient dimension, and are more suitable for the case of high-dimensional matrices.

In future work, we will use the obtained results to study tail bounds and expectation bounds for other eigenvalues of random matrix series.

Bibliography

1. Tulino A.M., Verdú S. Random matrix theory and wireless communications. Found. Trends Commun. 2004, 1, 1–182. doi:10.1561/0100000001
2. Naor A., Regev O., Vidick T. Efficient rounding for the noncommutative Grothendieck inequality. Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing, Palo Alto, CA, USA, 1–4 June 2013, pp. 71–80.
3. Gittens A., Mahoney M.W. Revisiting the Nyström method for improved large-scale machine learning. J. Mach. Learn. Res. 2016, 17, 3977–4041.
4. Louart C., Liao Z., Couillet R. A random matrix approach to neural networks. Ann. Appl. Probab. 2018, 28, 1190–1248. doi:10.1214/17-AAP1328
5. Wang Z., Zhu Y. Deformed semicircle law and concentration of nonlinear random matrices for ultra-wide neural networks. Ann. Appl. Probab. 2024, 34, 1896–1947. doi:10.1214/23-AAP2010
6. Martin C.H., Mahoney M.W. Implicit self-regularization in deep neural networks: Evidence from random matrix theory and implications for learning. J. Mach. Learn. Res. 2021, 22, 1–73.
7. Wigner E.P. Random matrices in physics. SIAM Rev. 1967, 9, 1–23. doi:10.1137/1009001
8. Guhr T., Müller-Groeling A., Weidenmüller H.A. Random-matrix theories in quantum physics: Common concepts. Phys. Rep. 1998, 299, 189–425. doi:10.1016/S0370-1573(97)00088-4