Computation of the Expected Euler Characteristic for the Largest Eigenvalue of a Real Non-central Wishart Matrix
Nobuki Takayama, Lin Jiu, Satoshi Kuriki, and Yi Zhang

TL;DR
This paper develops an approximate formula for the distribution of the largest eigenvalue of real Wishart matrices using the expected Euler characteristic method, including differential equations and numerical analysis for small dimensions.
Contribution
It introduces a new approximation method for eigenvalue distributions of Wishart matrices applicable to general dimensions, with detailed analysis for 2x2 cases.
Findings
Derived an approximate distribution formula for the largest eigenvalue.
Established a differential equation for the 2x2 case.
Performed numerical analysis validating the approximation.
Abstract
We give an approximate formula for the distribution of the largest eigenvalue of real Wishart matrices by the expected Euler characteristic method for the general dimension. The formula is expressed in terms of a definite integral with parameters. We derive a differential equation satisfied by the integral for the matrix case and perform a numerical analysis of it.
| 0 | 1 | 2 | 3 | 4 | 5 | |
|---|---|---|---|---|---|---|
| 0.74 | 0.56 | 0.14 | 0.014 | 0.00058 | ||
| 1. | 0.95 | 0.57 | 0.14 | 0.014 | 0.00058 |
| # pars | ||||||
|---|---|---|---|---|---|---|
| Chyzak | - | - | - | - | ||
| Heuristic | - |
| HGM | ||||||
|---|---|---|---|---|---|---|
| mc |
| simulation | ||
|---|---|---|
| 3.8133 | 0.051146 | 0.051176 |
| 3.8166 | 0.047517 | 0.047695 |
| 3.82 | 0.044120 | 0.044515 |
| simulation (with 100000 tries) | ||
|---|---|---|
| 3 | 0.215428520 | 0.217072 |
| 4 | 0.016122970 | 0.016195 |
| 5 | 0.000357368 | 0.000386 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsRandom Matrices and Applications · Point processes and geometric inequalities · Morphological variations and asymmetry
Computation of the expected Euler characteristic for the largest eigenvalue of a real non-central Wishart matrix
Nobuki Takayama
Lin Jiu
Satoshi Kuriki
Yi Zhang
Department of Mathematics, Kobe University, Japan
Department of Mathematics and Statistics, Dalhousie University, Canada
The Institute of Statistical Mathematics, Research Organization of Information and Systems, Japan
Department of Mathematical Sciences, The University of Texas at Dallas, USA
Department of Mathematical Sciences, Xi’an Jiaotong-Liverpool University, China
Abstract
We give an approximate formula for the distribution of the largest eigenvalue of real Wishart matrices by the expected Euler characteristic method for general dimension. The formula is expressed in terms of a definite integral with parameters. We derive a differential equation satisfied by the integral for the matrix case and perform a numerical analysis of it.
keywords:
Euler characteristic method, holonomic gradient method, real non-central Wishart distributions.
MSC:
[2010] 62H10 , 68W30
††journal: Journal of Multivariate Analysis
1 Introduction
For , let be independently distributed as the -dimensional (real) Gaussian distribution , where and are the mean vector and covariance matrix of , respectively. The (real) Wishart distribution is the probability measure on the cone of positive semi-definite matrices induced by the random matrix
[TABLE]
Here, is a parameter matrix. Unless vanishes, the corresponding distribution is referred to as the non-central (real) Wishart distribution.
The largest eigenvalue of is used as a test statistic for testing and/or under the assumption that is positive semi-definite. This test statistic is expected to have good power when the matrices and are of small size.
When testing hypotheses, the distribution of , which is the largest eigenvalue of , is of particular interest as it gives the power of the test. When , the works by James and other authors (see, e.g., Muirhead [31]) show that the cumulative distribution function of can be written as a hypergeometric function with matrix argument as follows:
[TABLE]
where is a known constant [31, Corollary 9.7.2]. It is well known that the hypergeometric function has a series expression in the zonal polynomial with index , which is a partition of an integer. However, in view of numerical calculation, this is less useful because the explicit form of is not known unless the rank of the matrix is 1 or 2. On account of this difficulty, Hashiguchi et al. [14] proposed a holonomic gradient method (HGM) for numerical evaluation, which utilizes a holonomic system of differential equations for computation. However, when , the situation is more difficult. In this case, the cumulative distribution function cannot be expressed as a simple series of zonal polynomials. Hayakawa [15, Corollary 10] provides a formula for the cumulative distribution function as a series expansion in the Hermite polynomial with symmetric matrix argument defined by the Laplace transform of as follows:
[TABLE]
The Hermite polynomial can be written as a linear combination of the zonal polynomial ; however, the coefficients not provided explicitly [7]. Another approach is to use invariant polynomials proposed by Davis [11, 12]. Using the probability density function of the non-central Wishart distribution derived by James [19], the cumulative distribution function of is shown to be proportional to
[TABLE]
Díaz-Garí and Gutiérrez-Jáimez [13] showed that this has a series expansion in terms of invariant polynomials. Here, the invariant polynomial is a polynomial in two matrices indexed by two partitions. Although, in principle, the invariant polynomial can be expressed in terms of zonal polynomials in two matrices, it is challenging to utilize this expression for numerical calculation.
In this paper, instead of the direct calculation approach, we approximate the distribution function through the expected Euler characteristic heuristic or the Euler characteristic method proposed by Adler [1] and Worsley [36]. (see also [2] and [29].) This is a methodology to approximate the tail upper probability of a random field. In our problem, since the square root of the largest eigenvalue is the maximum of a Gaussian field
[TABLE]
this method is applicable ( [27], [28]). One can show that the Euler characteristic method evaluates the quantity
[TABLE]
rather than . Nevertheless, this formula approximates well when is large because () are negligible for large . This is practically sufficient for our purpose because only the upper tail probability is required in testing hypotheses.
In this paper, we consider the non-central real Wishart matrix. In the multiple-input multiple-output (MIMO) problem, the non-central complex Wishart matrix also plays an important role. The largest eigenvalue of the non-central complex Wishart matrix is significantly easier to handle in that case because the explicit formula for the cumulative distribution was given by Kang and Alouini [21]. The holonomic gradient method based on Kang and Alouini’s formula was proposed in [10].
In general, the approximation error of the Euler characteristic has not been extensively studied. Nevertheless, the Euler characteristic heuristic is widely used as an approximation of the tail probability of the supremum because of the difficulty of the original problem and the empirical usefulness of this heuristic (see, e.g., [35]). One exception is the Gaussian process with mean zero and variance one, which corresponds to the central Wishart case where is proportional to the identity matrix and vanishes. In this particular case, the approximation error has been fully investigated [27, 28]; the details are presented in B. For the non-central case, we present the following lemma for which the proof is provided in A:
Lemma 1**.**
Assume that . If either or has distinct eigenvalues, then it holds that
[TABLE]
This implies that the Euler characteristic approximation (1) is justified as an approximation for . We conjecture that this holds for arbitrary and arbitrary configurations of and .
The rest of the paper is organized as follows. In Section 2, we provide an integral representation formula for the expectation of the Euler characteristic for random matrices of a general size. In Section 3, we consider the case of random matrices and study the integral representation derived in Section 2 in the polar coordinate system and investigate it from a numerical point of view. By virtue of the theory of holonomic systems (see, e.g., [16]), the integral representation given in Section 2 satisfies a holonomic system of linear differential equations. However, its explicit form is not known in general. In Section 4, we consider the case of random matrices again to demonstrate that the recent development [22, 25] of computer aided proofs and derivations (CAPD), for combinatorial identities, proofs of them, derivations of difference, and differential equations can be applied to the evaluation of definite integrals or sums. We derive a differential equation which is satisfied by the integral representation of the expectation of the Euler characteristic with the help of computer algebra algorithms and perform a numerical analysis of the differential equation. Thus, a new efficient method to numerically evaluate the Euler expectation, when the numerical integration is difficult to perform, is obtained. Last but not least, in B, we give a closed formula, expressed in terms of the Laguerre polynomial, for the expectation of the Euler characteristic for random matrices of general size for the central and scalar covariance case.
2 Expectation of an Euler characteristic number
Let be a real matrix-valued random variable (random matrix) with density
[TABLE]
We assume that is smooth and . Define a manifold
[TABLE]
where , and are column vectors, and is a rank matrix. Set
[TABLE]
and
[TABLE]
Proposition 1**.**
Let be a random matrix as aforementioned. The following claims are equivalent:
- (i)
The function has a critical point at . 2. (ii)
The vectors are left and right eigenvectors of , respectively. In other words, there exists a constant such that , .
Moreover, the function takes value at the critical point .
Proof. We assume that and are expressed by local coordinates and , respectively, where and . We denote by and by . Since , we have , where . We omit , which represents the action, when there is no ambiguity. Analogously, we have , where .
Assume that is a (random real) matrix. Let us consider the function expressed by the local coordinate
[TABLE]
At the critical point of , we have
[TABLE]
The aforementioned equality (2)holds for each , and is a local coordinate of , which implies that all ’s are linearly independent. Therefore, there exists a constant such that at the critical point. Analogously, there exists a constant such that . Let us show that . We have
[TABLE]
and
[TABLE]
Therefore, we have at the critical point.
Conversely, and at a point imply that is a critical point of . ∎
We consider a continuous family of elements of parameterized by the first column vector . In other words, we consider a continuous family of orthogonal frames of parameterized by . An element of is denoted by , where is an matrix. Analogously, we take a family parameterized by , where is an matrix parameterized by . Set
[TABLE]
The matrix can be expressed as
[TABLE]
Intuitively, this is a partial singular value decomposition. We denote the set of all matrices by .
The aforementioned decomposition provides a coordinate system for the space of random matrices . Without loss of generality, we assume that . We sort the singular values of in descending order, and denote by the -th singular value of the matrix . For a real number , we define
[TABLE]
Subsequently, we set
[TABLE]
and
[TABLE]
For a matrix in , we sort the singular values of in descending order as follows:
[TABLE]
Let and be the left and right eigenvectors of for , respectively. Note that and are respective eigenvectors of and for the eigenvalue , which implies that and are uniquely determined modulo the multiplication by . Define a map from to by
[TABLE]
The matrix lies in because the singular values of agree with those of excluding .
Lemma 2**.**
The map in (4) is smooth and isomorphic.
Proof. Define a map from to by
[TABLE]
Based on calculation, we observe that and are identity maps. The map is then one-to-one and surjective. Next, we show that the map is smooth. Since we assume that all the singular values are different, the maps of taking the -th singular value of a given and an eigenvector for the singular value are smooth on an open connected neighborhood of (by checking the Jacobian does not vanish). The inverse map is then locally smooth. Hence, is smooth and isomorphic. ∎
We are interested in the Euler characteristic of .
Theorem 1**.**
Suppose that and is a Morse function for almost all ’s. We further assume that if a set is of measure zero with respect to the Lebesgue measure, it is also a measure zero set with respect to the measure . The expectation of the Euler characteristic number equals
[TABLE]
Here, we set , , where and are the -th column vectors of and , respectively, and
Note that and are and invariant measures on and , respectively.
Proof. Without loss of generality, we assume that . According to Morse theory, if is a Morse function, which is a smooth function without a degenerated critical point, then we have
[TABLE]
where is the -th singular value of , and are left and right eigenvectors, and . The equality (8) is the Morse theorem for manifolds with boundaries. The equalities (8) and (11) can be established as follows.
First, we have the relation . By differentiating it with respect to , we have . Let us evaluate . By the expression , it is equal to
[TABLE]
Next, we evaluate :
[TABLE]
Third, we evaluate :
[TABLE]
Summarizing the aforedescribed calculations, the Hessian is equal to
[TABLE]
Since , the sign of the determinant of the Hessian is equal to that of the middle of the above 3 matrices.
The equalities (11) and (12) can now be established; we fix and omit the superscript in the following discussion. We consider the product of the following two matrices:
[TABLE]
which is equal to
[TABLE]
Since the bottom-left block is , the determinant of this matrix is . Setting and , we have
[TABLE]
Since , the determinant of the matrix above is equal to . In summary, we have obtained equalities of (11) and (12).
With regard to the expectation of the Euler characteristic number, exchanging the sum and the integral, we have
[TABLE]
To evaluate the expectation of the Euler characteristic number, we require the Jacobian of (3). According to standard arguments in multivariate analysis (see, e.g., [34, (3.19)]), we have
[TABLE]
Subsequently, we have
[TABLE]
The factor is owing to the multiplicity of being . Set . For , since and are measure zero sets, we may sum up integral domains for into one domain as
[TABLE]
Thus, we have derived the conclusion. ∎
The integral (5) does not depend on the choice of nor . The reason is as follows. The column vectors of the matrix are of length 1 and are orthogonal to the vector . Let be a matrix with the same property. In other words, we assume . Then there exists an orthogonal matrix such that and hold. Taking the exterior product of elements of , we have
[TABLE]
The case for can be shown analogously.
One of the most important examples is that has a Gaussian distribution , where is the Kronecker product of matrices. In this case, we have
[TABLE]
The largest singular value of is the square root of the largest eigenvalue of a non-central Wishart matrix . Substituting (3) and (13) into (5), we have
[TABLE]
In this expression, the number of parameters is ; therefore, it is over-parameterized. Note that
[TABLE]
Let be a spectral decomposition, where . Then we have
[TABLE]
Let be a QR decomposition, where is lower triangular matrix with nonnegative diagonal elements and . Then . Since the largest eigenvalues of and are the same, we can assume without loss of generality that is a diagonal matrix and is a lower triangle with nonnegative diagonal elements, i.e.,
[TABLE]
When has multiple roots, i.e.,
[TABLE]
by multiplying and its transpose from the left and right, we can assume
[TABLE]
Therefore, our problem can be formalized as follows: Evaluate (14) with parameters (16) (or (17) and (18)).
In the following sections, we will evaluate the integral representation of the expectation of the Euler characteristic given in Theorem 1 for some interesting special cases. We can obtain approximate values of the probability of the largest eigenvalue of random matrices by virtue of them. The Euler characteristic heuristic is
[TABLE]
The condition that is a Morse function with probability one holds if has distinct and non-zero singular values with probability one.
3 The case of
We derive Theorem 1 in the special case of by taking explicit coordinates. This derivation motivates the proof for the general case discussed in the previous section. The case is studied numerically in the last section with the holonomic gradient method (HGM).
Fix two unit vectors
[TABLE]
Define
[TABLE]
which satisfies
[TABLE]
Similarly, we define . Here, in case the sum is greater than , both and should be treated as mod . Now, any matrix, say , can be recovered by
[TABLE]
with variables . We may further assume that , , and .
Fix and let
[TABLE]
By letting vary in and vary in , we recover four times:
[TABLE]
Here, for the first two cases, it is easily seen from the symmetry of the manifold (shown below) that is equivalent to .
- 2.
The second symmetry is given by , i.e., interchanging and . Note that and . Thus, there also exists
[TABLE]
recovering .
Therefore, to recover , we can always assume that and let . See Lemma 2 for a general claim.
Next, we consider the manifold
[TABLE]
and the function on such that
[TABLE]
Apparently, only has two pairs of eigenvectors, which can be verified by the following computations:
[TABLE]
The function has two critical points on , which are at
the point
- 2.
or .
Further computation indicates the following four facts:
- (i)
and ; 2. (ii)
From
[TABLE]
it follows that and . Therefore, we see
- (a)
if , then does not contain any critical points, so ; 2. (b)
if , then contains both critical points, and thus
[TABLE] 3. (c)
the only nontrivial case is , then
[TABLE] 3. (iii)
Since
[TABLE]
we have
[TABLE]
where is the exterior product for vectors. 4. (iv)
Let and such that
[TABLE]
Then
[TABLE]
where
[TABLE]
Hence, we have
[TABLE]
Note that we have by the anti-symmetry of and in this case. In other words, integrals over and are canceled. Thus, we have
[TABLE]
In summary, we have obtained Theorem 1 in the case that has a Gaussian distribution.
A numerical example is given below.
Example 1**.**
We evaluate (19) with parameters
[TABLE]
and derive Table 1.
Here, the probability is estimated by a Monte Carlo simulation with 10,000,000 iterations, and the expectation of the Euler characteristic is evaluated by a numerical integration function NIntegrate on Mathematica. As expected, when is large.
4 Computer algebra and the expectation for small and
In this section, we study the non-central case with the help of computer algebra. When , we can perform the holonomic gradient method (HGM) [14] to evaluate the integral (5).
In Section 3, we derive an integral formula (19) for the case . For (19), we set
[TABLE]
Then we have
[TABLE]
where is a rational function in . Since the integrand is a holonomic function in , we can apply the creative telescoping method [37] to derive holonomic systems for the integrals. This is straightforward for the inner single integral of by the classic methods [23] (such as Zeilberger’s algorithm, Takayama’s algorithm and Chyzak’s algorithm). Below is an example.
Example 2**.**
Consider the inner single integral of (20):
[TABLE]
where is a rational function in . Since is a holonomic function, we can compute a holonomic system satisfied by using the Mathematica package HolonomicFunctions [24]. Using the holonomic system satisfied by and Chyzak’s algorithm [8], we can then derive a holonomic system of , which is of holonomic rank . The detailed calculations can be found in the supplementary material [33].
In the aforementioned example, we use Chyzak’s algorithm to derive a holonomic system of the inner single integral of . This can be done within 5 seconds on a Linux computer with 15.10 GB RAM. However, experiments show that it is not efficient enough to derive a holonomic system for the inner double integral in the same way within reasonable computational time because of the complexity of this algorithm. To speed up the computation, we intend is to utilize the Stafford theorem [17, 30] empirically. Let us first recall the theorem. Assume that is a field of characteristic [math] and is a positive integer. Let and be the ring of differential operators with rational coefficients and the Weyl algebra in variables, respectively.
Theorem 2**.**
Every left ideal in or can be generated by two elements.
Assume that is a left ideal in or . We observe from experiments that for any two random operators , it is of high probability that . This suggests the following heuristic method for computing a holonomic system for the inner double integral of . As a matter of notation, we set
[TABLE]
Recall that a D-finite system [5] in is a finite set of generators of a zero-dimensional ideal in . The relation between D-finite systems and holonomic systems is illustrated in [16, Section 6.9]. For the application of the HGM, D-finite systems are alternative to holonomic systems. Here, we use D-finite systems because they are more efficient for computation.
Heuristic 1**.**
Given a D-finite system in , compute another D-finite system in such that
[TABLE]
- (i)
Choose two finite support set . 2. (ii)
Using the polynomial ansatz method [23, Section 3.4], check whether there exist telescopers of with support sets or not. If and exist, then go to the next step. Otherwise, go to step 1. 3. (iii)
Compute the Gröbner basis of with respect to a term order [9] in . If is D-finite, then output . Otherwise, go to step 1.
In the aforementioned heuristic method, we need to find two finite support set through trial and error so that the computation terminates and finishes in reasonable time. Next, we demonstrate its application to derive a D-finite system for the inner double integral of .
Example 3**.**
Consider the inner double integral of (20):
[TABLE]
where is defined in Example 2.
Let be a D-finite system of , which is derived from Example 2. Using and the polynomial ansatz method, we find two non-zero annihilators and for with support sets and , respectively, where
[TABLE]
Then we compute the Gröbner basis of in with respect to a total degree lexicographic order. We find that is a D-finite system of holonomic rank . The details of the calculation can be found in [33].
In the aforementioned example, we specify the parameters in the integrand as those in Example 1. Using Heuristic 1, we can further compute a holonomic system for the inner double integral of without specifying those parameters (pars). This is significantly more efficient than Chyzak’s algorithm. Table 2 compares Chyzak’s algorithm (chyzak) and Heuristic 1 (heuristic) in terms of computation time (s).
Next, we use Heuristic 1 to derive a D-finite system of the inner triple integral of and then numerically solve the corresponding ordinary differential equation. Finally, we use numerical integration to evaluate .
Example 4**.**
Consider
[TABLE]
where is specified in (21) with parameters
[TABLE]
By Example 3, we have derived a D-finite system for . Using Heuristic 1, we derive a D-finite system for the inner first integral of (22) of the following form:
[TABLE]
where . Now, we first numerically solve the ordinary differential equation to evaluate , and then evaluate by using numerical integration. Table 3 are the corresponding numerical results,
where mc represents the Monte Carlo simulation of by the following formula with 10,000,000 iterations:
[TABLE]
with
[TABLE]
where and are singular values of , .
As expected, the results of the HGM are approximate to those of the Monte Carlo simulation. The detailed computation can be found in [33].
The evaluations of in the above example are also approximate of those given in Example 1. The source codes for this section and a demo notebook are freely available as part of the supplementary electronic material [33].
Example 5**.**
We consider the evaluation of (20) with parameters
[TABLE]
It is difficult to evaluate (20) for the relatively large parameters by numerical integration (even with the Monte Carlo integration). Thus, we take a different approach. Using Heuristic 1, we can compute a linear ordinary differential equation (ODE) for (20) of rank with respect to the independent variable . Then we construct series solutions for this differential equation and use them to extrapolate results by simulations.
Although this extrapolation method is well known, we explain it in a subtle form with application in our evaluation problem. Consider an ODE with coefficients in of rank . Let be a point in the -space and we take increasing numbers , where . We construct a series solution as a series in . We may further assume that is not a singular point of the ODE for each . The initial value vector may be taken suitably so that the series is determined uniquely over .
We assume that the vector converges in a segment containing all ’s and that it is a basis of the solution space. Once we construct such a basis of series solutions, we can construct the solution that takes values at , . To be specific, we set
[TABLE]
with unknown coefficients ’s. Then we have
[TABLE]
The unknown coefficients ’s can be determined by solving the system of linear equations
[TABLE]
We call the extrapolation function by series solutions of ODE. We call the reference value of at reference point .
Let us now come back to our example. The linear ODE for (20) has rank . We set and the ’s as . Then we have
[TABLE]
We construct an approximate series solution by taking terms with rational arithmetic.
We set the reference points , and construct a matrix related to (23). Numbers in the matrix are translated to approximate rational numbers to avoid the instability problem of solving linear equations (23) with floating point numbers.
We assume that the expectation of the Euler characteristic of is almost equal to the probability that the first eigenvalue is larger than . In fact, we have the Euler expectation in this case, where is the -th eigenvalue. We have by the Monte-Carlo simulation with tries. Then we may suppose that reference values are estimated by Monte-Carlo simulation for . We construct a solution with these reference values. Evaluation of is done with big floats.
Table 4 represents the values of the extrapolation function obtained by the above method with the big floats of 380 digits and that by simulation with samples. One simulation takes approximately s by using the R package * mnormt* on a machine with Intel Xeon CPU(2.70GHz) and 256G memory.
The solid line in Fig. 1 is obtained by this extrapolation function. The line goes to a big value at because this is out of the domain of convergence of this approximate series. Dots are values obtained by simulation and those on the thick solid line are values used as reference values to obtain the extrapolation function.
We obtain the series with terms in s by using Risa/Asir on a machine with Intel Xeon CPU(2.70GHz) and 256G memory. The time to evaluate the extrapolation function at points is s. On the other hand, if we want to obtain simulation values at 61 points, we need about . Thus, our extrapolation method is of advantage in evaluating the function for many .
Appendix A Proof of Lemma 1
Recall that we are dealing with the Wishart matrix with given in (15). For an unit vector , is distributed as
[TABLE]
where represents the distribution of times a non-central chi-square random variable with degrees of freedom and non-central parameter .
We consider the case . From the characterization of the largest and smallest eigenvalues, we have and , where and are arbitrary unit vectors.
(i) Suppose that has two distinct eigenvalues. Set and to be two eigenvectors of corresponding to the eigenvalues and . Then
[TABLE]
where , . Note that can be zero.
The tail behaviors of the central and non-central chi-square distributions were investigated by Beran [4]. From (2.9) and (3.3) of [4], combined with the asymptotics for the modified Bessel function of the first kind as , we have
[TABLE]
as . In either case whether is zero or not, the right-hand side of (24) goes to zero as goes to infinity.
(ii) Suppose that and has two distinct eigenvalues. Set and to be two eigenvectors of corresponding to the eigenvalues and , respectively. Then
[TABLE]
which goes to zero as goes to infinity. ∎
Appendix B Central case with a scalar covariance: Selberg type integral and Laguerre polynomials
We assume that (central) and in (16) is a scalar matrix, and we study this case by special functions. Under these assumptions, we show that the expectation of the Euler characteristic can be expressed in terms of a Selberg type integral, which is equal to a Laguerre polynomial in view of the works by Aomoto [3] and Kaneko [20].
Theorem 3**.**
Let
[TABLE]
Assume that the distribution of random matrices is the Gaussian distribution with mean [math] and covariance . In other words, we have
[TABLE]
Then we have
[TABLE]
where are given by (25), (26), (28), (30), (34), respectively.
Proof. For , set
[TABLE]
[TABLE]
Then the matrix can be written as
[TABLE]
We denote by the middle matrix in the above expression.
Set and . We consider the central case in (14). Since and , we have
[TABLE]
It follows from Theorem 1 with being the normal distribution that
[TABLE]
where
[TABLE]
We denote by the -th column vector of and by the column vector of the differential form . Define
[TABLE]
It is an invariant measure for rotations on [18, Theorem 4.2]. We may define analogously.
Moreover, since , we have
[TABLE]
Since there is no involved on the right side of the above identity, we can separate the following integral:
[TABLE]
Therefore, we only need to evaluate the integral
[TABLE]
We denote the integral above by . In terms of , we have
[TABLE]
We make the singular value decomposition of the matrix as , where the matrices , (Stiefel manifold), (see, e.g., [18] and [34, (3.1)]). It follows from [34, (3.1)] that
[TABLE]
[TABLE]
the volume element of the Stiefel manifold, when . Here, is the -th column vector of . Since
[TABLE]
and
[TABLE]
we have
[TABLE]
where ,
[TABLE]
In (28), there is a constant involved in the denominator because in this case copies of the domain cover , and the correspondence between the coordinates of and those of its singular value decomposition is because we have the choice of signs of the eigenvector . For the volumes of and , see, e.g., [38, Proposition 2.23, Theorem 2.24].
In (27), we make a change of variables by . Then we have , and
[TABLE]
Furthermore, we have
[TABLE]
Set and factor out . Then it follows from that
[TABLE]
where
[TABLE]
and
[TABLE]
The integral (29) can be expressed as a polynomial in . Let us derive differential equations for this integral and express it in terms of a special polynomial. We utilize the result by Aomoto [3] and its generalization by Kaneko [20]. In [20], a system of differential equations, special values, and an expansion in terms of Jack polynomials were given for the integral
[TABLE]
[TABLE]
when or . Let us make the coordinate change , , . Then we have , ,
[TABLE]
The integral (31) becomes
[TABLE]
[TABLE]
When , the above integral divided by converges to
[TABLE]
[TABLE]
Let us apply this limiting procedure to derive a differential equation for the above integral. When , the differential equation for the integral (31) is
[TABLE]
where , , . This is the Gauss hypergeometric equation. Set , . Then we can find the limit of this equation when . In fact, it can be performed as follows. Set . Note that (33) is invariant by the scalar multiplication of . Then the limit of
[TABLE]
when , equals
[TABLE]
In particular, when and , we have
[TABLE]
A polynomial solution of the above equation can be written as a constant multiple of the confluent hypergeometric polynomial . Therefore, it follows from (29), (32) and the above argument that
[TABLE]
where
[TABLE]
by taking a limit of the Selberg integral formula [32] with an analogous method as was used when deriving (32). ∎
Let us make a numerical evaluation by utilizing Theorem 3 when . If , we have
[TABLE]
Since
[TABLE]
where the last integral is equal to the upper tail probability of the Gamma distribution with scale parameter and shape parameter . It follows from Theorem 3 that the expectation is equal to
[TABLE]
An R code for evaluating in this case is as follows:
ug2<-function(s,k,x) { return(pgamma(x^2, scale=2/s, shape=k+1/2, lower = FALSE)* gamma(k+1/2)(2/s)^(k+1/2)/2); } ec3<-function(x,s) { cc<- 2(2/pi)^(1/2)s^(1/2); c5<-1; return(ccc5* (ug2(s,0,x)-2sug2(s,1,x)+(1/2)s^2ug2(s,2,x))); }
Draw a graph
curve(ec3(x,1),from=1,to=10)
When , some values are given in Table 5:
We present two graphs in Fig. 2 to compare our approximate formula with the exact values by the Pfaffian of a matrix (see, e.g., [26, 6]). The matrix sizes are and , and . The horizontal axis is . Note that our approximation formula is expressed as a finite sum of terms of incomplete Gamma functions which can be evaluated faster than the Pfaffian of an matrix when becomes larger. The approximation error was evaluated by [27, 28] as
[TABLE]
which is exponentially smaller than
[TABLE]
This explains the very accurate tail behaviors in Figs. 2 and 3. Note that is always negative because (1) is always less than .
The two graphs in Fig. 3 are to compare our approximate formula with values by a simulation of tries. The matrix size is and , respectively, and . The horizontal axis is .
Acknowledgments
The authors would like to thank the Editor-in-Chief, Associate Editors and two anonymous referees who kindly reviewed the earlier versions of this paper and provided valuable comments and suggestions. Besides, we also deeply thank Christoph Koutschan, who is the author of the package HolonomicFunctions used in this study, for his help and encouragement. This research is partially supported by the Austrian Science Fund (FWF): P29467-N32, JSPS KAKENHI Grant Number 16H02792, the UTD start-up grant: P-1-03246, the Natural Science Foundation of USA grants: CCF-1815108 and CCF-1708884, and JST CREST Grant Number JP19209317.
References
- Adler [1981]
R. J. Adler, The geometry of random fields, John Wiley & Sons, Ltd., Chichester, 1981. Wiley Series in Probability and Mathematical Statistics.
- Adler and Taylor [2007]
R. J. Adler, J. E. Taylor, Random fields and geometry, Springer, New York, 2007.
- Aomoto [1987]
K. Aomoto, Jacobi polynomials associated with selberg integrals, SIAM Journal on Mathematical Analysis 18 (1987) 545–549.
- Beran [1975]
R. Beran, Tail probabilities of noncentral quadratic forms, The Annals of Statistics 3 (1975) 969–974.
- Chen et al. [2019]
S. Chen, M. Kauers, Z. Li, Y. Zhang, Apparent singularities of D-finite systems, Journal of Symbolic Computation 95 (2019) 217–237.
- Chiani [2016]
M. Chiani, Distribution of the largest root of a matrix for roy’s test in multivariate analysis of variance, Journal of Multivariate Analysis 143 (2016) 467–471.
- Chikuse [1992]
Y. Chikuse, Properties of Hermite and Laguerre polynomials in matrix argument and their applications, Linear Algebra and its Applications 176 (1992) 237–260.
- Chyzak [2000]
F. Chyzak, An extension of Zeilberger’s fast algorithm to general holonomic functions, Discrete Mathematics 217 (1-3) (2000) 115–134.
- Coxe et al. [2015]
D. A. Coxe, J. Little, D. O’Shea, Ideals, Varieties, and Algorithms, Springer, New York, 4th edition, 2015.
- Danufane et al. [2017]
F. H. Danufane, C. Siriteanu, K. Ohara, N. Takayama, Holonomic gradient method-based cdf evaluation for the largest eigenvalue of a complex noncentral Wishart matrix, 2017. ArXiv:1707.02564.
- Davis [1979]
A. W. Davis, Invariant polynomials with two matrix arguments extending the zonal polynomials: Applications to multivariate distribution theory, Annals of the Institute of Statistical Mathematics 31 (1979) 465–485.
- Davis [1980]
A. W. Davis, Invariant polynomials with two matrix arguments: extending the zonal polynomials, in: P. R. Krishnaiah (Ed.), Multivariate Analysis V, North-Holland Publishing Company, 1980, pp. 287–299.
- Díaz-García and Gutiérrez-Jáimez [2011]
J. A. Díaz-García, R. Gutiérrez-Jáimez, On Wishart distribution: Some extensions, Linear Algebra and its Applications 435 (2011) 1296–1310.
- Hashiguchi et al. [2013]
H. Hashiguchi, Y. Numata, N. Takayama, A. Takemura, Holonomic gradient method for the distribution function of the largest root of a wishart matrix, Journal of Multivariate Analysis 117 (2013) 296–312.
- Hayakawa [1969]
T. Hayakawa, On the distribution of the latent roots of a positive definite random symmetric matrix i, Annals of the Institute of Statistical Mathematics 21 (1969) 1–21.
- Hibi and et al. [2013]
T. Hibi, et al., Gröbner Bases: Statistics and software systems, Springer, New York, 2013.
- Hillebrand and Schmale [2001]
A. Hillebrand, W. Schmale, Towards a effective version of a theorem of Stafford, Journal of Symbolic Computation 32 (2001) 699–716.
- James [1954]
A. T. James, Normal multivariate analysis and the orthogonal group, The Annals of Mathematical Statistics 25 (1954) 40–75.
- James [1955]
A. T. James, The non-central Wishart distribution, Proceedings of the Royal Society of London 229 (1955) 364–366.
- Kaneko [1993]
J. Kaneko, Selberg integrals and hypergeometric functions associated with Jack polynomials, SIAM Journal on Mathematical Analysis 24 (1993) 1086–1110.
- Kang and Alouini [2003]
M. Kang, M. S. Alouini, Largest eigenvalue of complex wishart matrices and performance analysis of MIMO MRC systems, IEEE Journal on Selected Areas in Communications 21 (2003) 418–426.
- Kauers et al. [2009]
M. Kauers, C. Koutschan, D. Zeilberger, Proof of Ira Gessel’s lattice path conjecture, Proceedings of the National Academy of Sciences 106 (28) (2009) 1150211505.
- Koutschan [2009]
C. Koutschan, Advanced applications of the holonomic systems approach, Ph.D. thesis, Johannes Kepler University Linz, 2009.
- Koutschan [2010]
C. Koutschan, HolonomicFunctions user’s guide, Technical Report, Johannes Kepler University Linz, 2010. http://www.risc.jku.at/publications/download/risc_3934/hf.pdf.
- Koutschan et al. [2011]
C. Koutschan, M. Kauers, D. Zeilberger, Proof of George Andrews’s and David Robbins’s -TSPP conjecture, Proceedings of the National Academy of Sciences 108 (6) (2011) 21962199.
- Krishnaiah and Chang [1971]
P. R. Krishnaiah, T. C. Chang, On the exact distributions of the extreme roots of the wishart and manova matrices, Journal of Multivariate Analysis 1 (1971) 108–117.
- Kuriki and Takemura [2001]
S. Kuriki, A. Takemura, Tail probabilities of the maxima of multilinear forms and their applications, The Annals of Statistics 29 (2001) 328–371.
- Kuriki and Takemura [2008]
S. Kuriki, A. Takemura, Euler characteristic heuristic for approximating the distribution of the largest eigenvalue of an orthogonally invariant random matrix, Journal of Statistical Planning and Inference 138 (2008) 3357–3378.
- Kuriki and Takemura [2009]
S. Kuriki, A. Takemura, volume of tubes and the distribution of the maximum of a Gaussian random field, selected papers on probability and statistics, American Mathematical Society Translations Series 2 227 (2009) 25–48.
- Leykin [2004]
A. Leykin, Algorithmic proofs of two theorems of Stafford, Journal of Symbolic Computation 38 (2004) 1535–1550.
- Muirhead [2005]
R. J. Muirhead, Aspects of multivariate statistical theory, Wiley, 2005.
- Selberg [1944]
A. Selberg, Remarks on a multiple integral, Norsk Matematisk Tidsskrift 26 (1944) 71–78.
- Takayama et al. [2019]
N. Takayama, L. Jiu, S. Kuriki, N. Takayama, Y. Zhang, Supplementary electronic material to the article computations of the expected Euler characteristic for the largest eigenvalue of a real non-central Wishart matrix, 2019. https://yzhang1616.github.io/ec1/ec1.html.
- Takemura and Kuriki [1999]
A. Takemura, S. Kuriki, Shrinkage to smooth non-convex cone: Principal component analysis as Stein estimation, Communications in Statistics: Theory and Methods 28 (1999) 651–669.
- Taylor and Worsley [2013]
J. E. Taylor, K. J. Worsley, Detecting sparse cone alternatives for Gaussian random fields, with an application to fMRI, Statistica Sinica 23 (2013) 1629–1656.
- Worsley [1995]
K. J. Worsley, Boundary corrections for the expected Euler characteristic of excursion sets of random fields, with an application to astrophysics, Advances in Applied Probability 27 (1995) 943–959.
- Zeilberger [1991]
D. Zeilberger, The method of creative telescoping, Journal of Symbolic Computation 11 (1991) 195–204.
- Zhang [2015]
L. Zhang, volumes of orthogonal groups and unitary groups, 2015. ArXiv:1509.00537.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Adler [1981] R. J. Adler, The geometry of random fields, John Wiley & Sons, Ltd., Chichester, 1981. Wiley Series in Probability and Mathematical Statistics.
- 2Adler and Taylor [2007] R. J. Adler, J. E. Taylor, Random fields and geometry, Springer, New York, 2007.
- 3Aomoto [1987] K. Aomoto, Jacobi polynomials associated with selberg integrals, SIAM Journal on Mathematical Analysis 18 (1987) 545–549.
- 4Beran [1975] R. Beran, Tail probabilities of noncentral quadratic forms, The Annals of Statistics 3 (1975) 969–974.
- 5Chen et al. [2019] S. Chen, M. Kauers, Z. Li, Y. Zhang, Apparent singularities of D-finite systems, Journal of Symbolic Computation 95 (2019) 217–237.
- 6Chiani [2016] M. Chiani, Distribution of the largest root of a matrix for roy’s test in multivariate analysis of variance, Journal of Multivariate Analysis 143 (2016) 467–471.
- 7Chikuse [1992] Y. Chikuse, Properties of Hermite and Laguerre polynomials in matrix argument and their applications, Linear Algebra and its Applications 176 (1992) 237–260.
- 8Chyzak [2000] F. Chyzak, An extension of Zeilberger’s fast algorithm to general holonomic functions, Discrete Mathematics 217 (1-3) (2000) 115–134.
