Integral Transform Methods in Goodness-of-Fit Testing, II: The Wishart Distributions
Elena Hadjicosta, Donald Richards

TL;DR
This paper develops goodness-of-fit tests for Wishart distributions using Hankel transforms of matrix arguments, with applications to various fields and analysis of test properties.
Contribution
It introduces a new Hankel transform-based method for goodness-of-fit testing of Wishart distributions with theoretical and practical insights.
Findings
Derived the null distribution of the test statistic.
Proved the test's consistency against broad alternatives.
Applied the test to financial data.
Elena Hadjicosta and Donald Richards
Department of Statistics, Pennsylvania State University, University Park, PA 16802, U.S.A.
MSC 2010 subject classifications: Primary 33C10, 62G10; Secondary 15A52, 62G20, 62H15.
Key words and phrases: Bahadur slope; Bessel function of matrix argument; contamination model; contiguous alternative; Frobenius norm; Gaussian random field; generalized Laguerre polynomial; goodness-of-fit testing; Hankel transform of matrix argument; Hilbert-Schmidt operator; hypergeometric function of matrix argument; operator norm; Pitman efficiency; Schur’s lemma; Wishart distribution; zonal polynomials.
Abstract
We initiate the study of goodness-of-fit testing when the data consist of positive definite matrices. Motivated by the recent appearance of the cone of positive definite matrices in numerous areas of applied research, including diffusion tensor imaging, models of the volatility of financial time series, wireless communication systems, and the analysis of polarimetric radar images, we apply the method of Hankel transforms of matrix argument to develop goodness-of-fit tests for Wishart distributions with given shape parameter and unknown scale matrix. We obtain the limiting null distribution of the test statistic and the corresponding covariance operator. We show that the eigenvalues of the operator satisfy an interlacing property, and we apply our test to some financial data. Moreover, we establish the consistency of the test against a large class of alternative distributions and we derive the asymptotic distribution of the test statistic under a sequence of contiguous alternatives. We establish the Bahadur and Pitman efficiency properties of the test statistic and we show the validity of a modified Wieand condition.
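Contents
- 1 Introduction
- 2 Wishart Distributions and Hankel Transforms of Matrix Argument
- 2.1 Preliminary results for the Wishart distributions
- 2.2 Bessel functions and Laguerre polynomials of matrix argument
- 2.3 Hankel transforms of matrix argument
- 2.4 Orthogonally invariant Hankel transforms of matrix argument
- 3 Goodness-of-Fit Tests for the Wishart Distributions
- 3.1 The test statistic
- 3.2 The limiting null distribution of the test statistic
- 3.3 Eigenvalues and eigenfunctions of the covariance operator
- 4.3 The distribution of the test statistic under contiguous alternatives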
1 Introduction
In this paper, we develop goodness-of-fit tests for the Wishart distributions, extending the results of [7, 65] for the exponential distributions and [32, 33] for the gamma distributions. In recent years, the cone of positive definite matrices has arisen in numerous applications, e.g., diffusion tensor imaging, financial time series, wireless communication systems, and polarimetric radar images; it is these applications that motivate our study of goodness-of-fit tests for probability distributions on the cone.
Positive definite random matrix data have appeared in medical research, specifically in diffusion tensor imaging (DTI), cf. [22, 39, 40, 44, 50, 58, 59, 60]. DTI is a magnetic resonance imaging method that has attracted much interest in the study of brain diseases. DTI is based on the observation that water molecules in vivo are always in motion; by modelling the diffusion of the water molecules at any location by a three-dimensional Brownian motion, the resulting diffusion tensor image is represented by the positive definite matrix of the local diffusion process at the given location.
Although DTI is non-invasive, it enables the study of deep brain white-matter fibers. Thus, DTI has been used to study epileptic seizures, Alzheimer’s disease, traumatic brain injuries, aging, white-matter abnormalities, developmental disorders, and psychiatric conditions [54, 57, 55, 52]. DTI has also been used to study the pathology of organ and tissue types such as the breast, cardiac, kidney, lingual, skeletal muscles, and spinal cord [19]. In numerous articles, the Wishart distribution with known degrees-of-freedom and unknown scale matrix has been used to model DTI data [22, 39, 40].
The Wishart distributions with known degrees-of-freedom also arise in stochastic volatility models [4, 27, 47]. In this area, the problem is to estimate the covariance matrix of the joint capital returns on several financial assets, with the goal of predicting future returns, devising portfolio allocations, pricing options, and estimating risk.
The complex Wishart distributions with known degrees-of-freedom arise in the spectral analysis of multivariate Gaussian time series [26], wireless communications [63, 64, 67], and the analysis of polarimetric synthetic aperture radar [2, 3]. These applications are widespread, for the spectral analysis of such time series arises in signal processing, econometrics, and meteorology, and polarimetric radar has become an important remote sensing device due to its heightened ability to distinguish between distinct scattering sources. The results to follow can be extended, after making obvious necessary changes, to the complex Wishart distributions [38, p. 488] and even to the Wishart distributions on general symmetric cones [24].
The technical details required to develop goodness-of-fit tests for positive definite matrix data are extensive. Naturally, we will need mathematical analysis on the cone of positive definite matrices [51], the Bessel functions and Laguerre polynomials of matrix argument and their zonal polynomial expansions [28, 35, 38, 53], and Hankel transforms of matrix argument [35]. Further complications arising from the non-commutative nature of matrix multiplication lead us to impose on the distribution of the sample data an orthogonal invariance condition. In addition, the Frobenius, spectral, and operator norms arise in the matrix case, and numerous inequalities between them will be needed. There is also the surprising appearance of Schur’s lemma, a result well-known in linear algebra but which appears only rarely in statistical inference.
We now describe the results in the paper. Throughout, we will follow as templates the presentations in [7, 33, 65]. In Section 2 we provide some properties of the Wishart distributions, and related results for the Bessel functions, Hankel transforms, confluent hypergeometric functions, and generalized Laguerre polynomials, all of matrix argument. Further, we present uniqueness theorems for the Hankel transform of matrix argument, a Hankel inversion formula, and some limit theorems. After providing results on a generalized hypergeometric function of two matrix arguments, we define the orthogonally invariant Hankel transform and present some of its properties.
In Section 3, we propose an integral-type test statistic for goodness-of-fit testing for the Wishart distributions. Generalizing the one-dimensional cases [7, 33, 65], the statistic $T_n^2$ is a squared integral, (3.3), involving the empirical orthogonally invariant Hankel transform. We obtain the asymptotic distribution of $T_n^2$ under the null hypothesis, proving that $T_n^2$ converges in distribution to a weighted sum of independent and identically distributed random variables, each having a chi-square distribution with one degree-of-freedom. The coefficients of the weighted sum are the positive eigenvalues of the covariance operator corresponding to a certain zero-mean Gaussian random field. The determination of the multiplicity of the eigenvalues remains an open problem, but we show that the eigenvalues satisfy an interlacing property and we show the usefulness of the interlacing property in an application of the test statistic to financial data. Also, we establish the consistency of the test against a large class of alternative distributions.
In Section 4, we derive the asymptotic distribution of $T_n^2$ under certain sequences of contiguous alternatives to the null hypothesis. Specifically, we consider Wishart alternatives with varying shape or scale parameters, and some classes of contaminated Wishart models in which the contamination distribution is a generalized inverted Gaussian.
Finally, in Section 5, we establish the Bahadur and Pitman efficiency properties of the statistic $T_n^2$. We investigate the approximate Bahadur slope of $T_n^2$ under local alternatives and we show the validity of a modified Wieand condition. A complete extension of Wieand’s condition, under which the Bahadur and Pitman efficiencies coincide, remains an open problem.
2 Wishart Distributions and Hankel Transforms of Matrix Argument
2.1 Preliminary results for the Wishart distributions
Throughout the paper, all needed results on the zonal polynomials and on the special functions of matrix argument are provided by Herz [35], Muirhead [53], or Richards [56], so we will generally conform to the notation in those sources. We denote the zero matrix of any order by $0$, the order being always determined by the context; further, $I_m$ will denote the $m \times m$ identity matrix. We also denote by $\mathbb{R}^{m \times m}$ the space of (real) $m \times m$ matrices, by $\mathcal{S}^{m \times m}$ the space of $m \times m$ symmetric matrices, by $\mathcal{P}_+^{m \times m}$ the cone of $m \times m$ positive-definite matrices, and by $O(m)$ the group of $m \times m$ orthogonal matrices. To specify that $X \in \mathcal{P}_+^{m \times m}$, we usually write $X > 0$; more generally, we write $X > Y$ whenever $X - Y > 0$. Further, we denote the trace of $X$ by $\operatorname{tr}(X)$, the determinant of $X$ by $\det(X)$, and we write $\operatorname{etr}(X)$ for $\exp(\operatorname{tr} X)$.
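The multivariate gamma function is defined by
$$\Gamma_m(\alpha) = \int_{X > 0} (\det X)^{\alpha - \frac12(m+1)} \operatorname{etr}(-X)\, dX,$$
for $\alpha \in \mathbb{C}$, $\operatorname{Re}(\alpha) > \frac12(m-1)$; this integral is well-known to have the explicit formula,
$$\Gamma_m(\alpha) = \pi^{m(m-1)/4} \prod_{j=1}^{m} \Gamma\big(\alpha - \tfrac12(j-1)\big).$$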
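As a quick numerical companion, the following minimal sketch evaluates $\log \Gamma_m(a)$ for real $a$, assuming the standard product formula $\Gamma_m(a) = \pi^{m(m-1)/4} \prod_{j=1}^m \Gamma(a - \frac12(j-1))$. The function name `log_multigamma` is ours and purely illustrative.

```python
import math


def log_multigamma(a: float, m: int) -> float:
    """Return log Gamma_m(a) for real a > (m - 1) / 2.

    Assumes the product formula
    Gamma_m(a) = pi^{m(m-1)/4} * prod_{j=1}^{m} Gamma(a - (j - 1) / 2).
    """
    if a <= (m - 1) / 2:
        raise ValueError("need a > (m - 1) / 2")
    log_pi_term = m * (m - 1) / 4 * math.log(math.pi)
    log_gammas = sum(math.lgamma(a - (j - 1) / 2) for j in range(1, m + 1))
    return log_pi_term + log_gammas
```

For $m = 1$ the function reduces to the classical log-gamma function, and for $m = 2$, $a = 3/2$ the product formula gives $\Gamma_2(3/2) = \pi^{1/2}\,\Gamma(3/2)\,\Gamma(1) = \pi/2$. Working on the log scale avoids overflow for large $a$ or $m$.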
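A positive-definite random matrix $X$ is said to have a Wishart distribution if its probability density function (p.d.f.) is of the form
$$w(X) = \frac{1}{\Gamma_m(\alpha)}\, (\det \Sigma)^{-\alpha}\, (\det X)^{\alpha - \frac12(m+1)} \operatorname{etr}(-\Sigma^{-1} X), \tag{2.1}$$
$X > 0$, where $\alpha > \frac12(m-1)$ and $\Sigma > 0$. We write $X \sim W_m(\alpha, \Sigma)$ whenever (2.1) holds. The parameter $\alpha$ is called the shape parameter and $\Sigma$ is called the scale matrix of $X$. If $\alpha$ is a half-integer then $2\alpha$ is called the degrees-of-freedom of $X$. In general, $E(X) = \alpha\Sigma$; also, if $M$ is a $k \times m$ matrix of rank $k$, where $k \le m$, then $MXM' \sim W_k(\alpha, M\Sigma M')$ [53, p. 92, Theorem 3.2.5].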
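A partition $\kappa = (k_1, \ldots, k_m)$ is a vector of nonnegative integers, listed in non-increasing order. The weight of $\kappa$ is $|\kappa| = k_1 + \cdots + k_m$, and the length, $\ell(\kappa)$, of $\kappa$ is the number of non-zero $k_j$, $j = 1, \ldots, m$.
For $a \in \mathbb{C}$ and a nonnegative integer $k$, the shifted factorial is defined as $(a)_k = a(a+1)\cdots(a+k-1)$, with $(a)_0 = 1$. For any partition $\kappa$, the partitional shifted factorial is defined as
$$(a)_\kappa = \prod_{j=1}^{m} \big(a - \tfrac12(j-1)\big)_{k_j}.$$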
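The two definitions above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names `partitions`, `shifted_factorial`, and `partitional_shifted_factorial` are ours, and we assume the standard definition $(a)_\kappa = \prod_{j} (a - \frac12(j-1))_{k_j}$. Exact rational arithmetic is used so that small cases can be checked by hand.

```python
from fractions import Fraction


def partitions(weight, length, max_part=None):
    """Yield all partitions of `weight` into at most `length` parts,
    as non-increasing tuples padded with zeros to exactly `length` entries."""
    if max_part is None:
        max_part = weight
    if length == 0:
        if weight == 0:
            yield ()
        return
    for first in range(min(weight, max_part), -1, -1):
        for rest in partitions(weight - first, length - 1, first):
            yield (first,) + rest


def shifted_factorial(a, k):
    """Return (a)_k = a (a + 1) ... (a + k - 1), with (a)_0 = 1."""
    result = Fraction(1)
    for i in range(k):
        result *= a + i
    return result


def partitional_shifted_factorial(a, kappa):
    """Return (a)_kappa = prod_j (a - (j - 1) / 2)_{k_j}."""
    result = Fraction(1)
    for j, k_j in enumerate(kappa, start=1):
        result *= shifted_factorial(a - Fraction(j - 1, 2), k_j)
    return result
```

For instance, with $m = 2$ the partitions of weight 3 are $(3, 0)$ and $(2, 1)$, and $(2)_{(2,1)} = (2)_2\,(3/2)_1 = 6 \cdot \frac32 = 9$.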
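For $X \in \mathcal{S}^{m \times m}$, we denote by $\det_j(X)$ the $j$th principal minor of $X$, $j = 1, \ldots, m$. For any partition $\kappa$, the zonal polynomial $Z_\kappa$ is defined as
$$Z_\kappa(X) = Z_\kappa(I_m) \int_{O(m)} \prod_{j=1}^{m} \big(\det\nolimits_j(H'XH)\big)^{k_j - k_{j+1}}\, dH, \qquad k_{m+1} := 0, \tag{2.2}$$
where $dH$ is the normalized Haar measure on $O(m)$ [56, (35.4.2)]. By (2.2), $Z_\kappa$ is homogeneous of degree $|\kappa|$.
It also follows from the invariance of the Haar measure that $Z_\kappa(H'XH) = Z_\kappa(X)$ for all $H \in O(m)$ and $X \in \mathcal{S}^{m \times m}$; hence, $Z_\kappa(X)$ depends only on the eigenvalues of $X$ and it is a symmetric function of the eigenvalues. Suppose that $Y > 0$ and that $Y^{1/2}$ denotes the unique positive definite square root of $Y$. Since the matrices $XY$, $YX$, and $Y^{1/2} X Y^{1/2}$ all have the same eigenvalues we will follow a widely-adopted convention, writing $Z_\kappa(XY)$ or $Z_\kappa(YX)$ for $Z_\kappa(Y^{1/2} X Y^{1/2})$; throughout the paper, we retain this convention for all orthogonally invariant functions of matrix argument.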
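With the normalization
$$Z_\kappa(I_m) = |\kappa|!\, 2^{2|\kappa|}\, \big(\tfrac12 m\big)_\kappa\, \frac{\prod_{1 \le i < j \le \ell(\kappa)} (2k_i - 2k_j - i + j)}{\prod_{j=1}^{\ell(\kappa)} (2k_j + \ell(\kappa) - j)!}, \tag{2.3}$$
the zonal polynomials satisfy the identity,
$$(\operatorname{tr} X)^k = \sum_{|\kappa| = k} Z_\kappa(X), \tag{2.4}$$
(see [53, Eq. (iii), p. 228] or [56, Eq. (35.4.6)]). Further, for $X, Y \in \mathcal{S}^{m \times m}$, the zonal polynomials satisfy the mean-value property [53, p. 243],
$$\int_{O(m)} Z_\kappa(HXH'Y)\, dH = \frac{Z_\kappa(X)\, Z_\kappa(Y)}{Z_\kappa(I_m)}. \tag{2.5}$$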
We will also need in the sequel the identity,
[TABLE]
, . This result is established by applying a power series identity,
[TABLE]
; see [38, p. 495, Eq. (143)], [53, p. 259, Eq. (4)]. Writing
[TABLE]
then (2.6) is obtained by comparing the coefficients of in (2.7) and (2.8).
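The zonal polynomials also satisfy a Laplace transform identity [53, p. 248]: For $\operatorname{Re}(\alpha) > \frac12(m-1)$, $Y \in \mathcal{S}^{m \times m}$, and $Z > 0$,
$$\int_{X > 0} \operatorname{etr}(-XZ)\, (\det X)^{\alpha - \frac12(m+1)}\, Z_\kappa(XY)\, dX = (\alpha)_\kappa\, \Gamma_m(\alpha)\, (\det Z)^{-\alpha}\, Z_\kappa(Y Z^{-1}). \tag{2.9}$$
For $\kappa = (0)$, this result reduces to
$$\int_{X > 0} \operatorname{etr}(-XZ)\, (\det X)^{\alpha - \frac12(m+1)}\, dX = \Gamma_m(\alpha)\, (\det Z)^{-\alpha}, \tag{2.10}$$
from which we confirm that (2.1) is a probability density function [53, p. 61].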
2.2 Bessel functions and Laguerre polynomials of matrix argument
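The Bessel function of matrix argument, first treated in detail by Herz [35], can be defined in several ways. Let $\nu \in \mathbb{C}$ be such that $-\nu + \frac12(j - m) \notin \mathbb{N} = \{1, 2, \ldots\}$ for all $j = 1, \ldots, m$; these restrictions ensure that $(\nu + \frac12(m+1))_\kappa \neq 0$ for all partitions $\kappa$. Following Muirhead [53, Chapter 7], the Bessel function (of the first kind) of order $\nu$ is defined for $X \in \mathcal{S}^{m \times m}$ as
$$J_\nu(X) = \frac{1}{\Gamma_m(\nu + \frac12(m+1))} \sum_{k=0}^{\infty} \frac{(-1)^k}{k!} \sum_{|\kappa| = k} \frac{Z_\kappa(X)}{(\nu + \frac12(m+1))_\kappa}. \tag{2.11}$$
We also refer to [24, 28, 38, 56] for further details of these Bessel functions. In particular, the series (2.11) converges absolutely for all $X \in \mathcal{S}^{m \times m}$ [28, Theorem 6.3].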
For Re, the Bessel function is also given by Herz’s generalization of the classical Poisson integral [35, Eq. (3.6´)]: For any matrix ,
[TABLE]
where and the integral is with respect to Lebesgue measure on the set . This result leads to an inequality that will arise repeatedly in the sequel.
Lemma 2.1.
For Re and ,
[TABLE]
Proof. Since then it follows from (2.11) and (2.12) that
[TABLE]
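To make the bound in Lemma 2.1 concrete, here is a minimal numerical sketch for the scalar case $m = 1$ with real order $\nu > -\frac12$. We assume that the zonal polynomial series for the Bessel function then reduces to $J_\nu(x) = \sum_{k \ge 0} (-1)^k x^k / (k!\,\Gamma(\nu + 1 + k))$ and that the bound reads $|J_\nu(x)| \le 1/\Gamma(\nu + 1)$. The function name `bessel_scalar` and the truncation length are illustrative choices.

```python
import math


def bessel_scalar(nu: float, x: float, terms: int = 200) -> float:
    """Truncated series sum_k (-1)^k x^k / (k! Gamma(nu + 1 + k)),
    the assumed m = 1 case of the matrix-argument Bessel function."""
    term = 1.0 / math.gamma(nu + 1.0)  # k = 0 term
    total = term
    for k in range(terms - 1):
        # ratio of consecutive terms: -x / ((k + 1) (nu + 1 + k))
        term *= -x / ((k + 1) * (nu + 1.0 + k))
        total += term
    return total


def respects_bound(nu: float, grid) -> bool:
    """Check |J_nu(x)| <= 1 / Gamma(nu + 1) on a grid of nonnegative x."""
    bound = 1.0 / math.gamma(nu + 1.0)
    return all(abs(bessel_scalar(nu, x)) <= bound + 1e-9 for x in grid)
```

For $\nu = \frac12$ the series sums to $\sin(2\sqrt{x})/(\sqrt{\pi}\sqrt{x})$, which gives an independent check of the implementation. The alternating series loses accuracy for very large $x$, so this sketch is intended for moderate arguments only.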
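For $\operatorname{Re}(\nu) > -1$, symmetric $Y$, and $Z > 0$, the Bessel function of matrix argument satisfies the Laplace transform identity,
$$\int_{X > 0} \operatorname{etr}(-XZ)\, J_\nu(XY)\, (\det X)^{\nu}\, dX = (\det Z)^{-\nu - \frac12(m+1)} \operatorname{etr}(-Y Z^{-1}). \tag{2.14}$$
Indeed, this identity is Herz’s original definition of $J_\nu$ [35, Eq. (2.5)].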
Herz [35, Eq. (5.8)] also obtained a fundamental generalization of a classical formula known as Weber’s second exponential integral: For Re, symmetric matrices and , and ,
[TABLE]
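Let $a, b \in \mathbb{C}$ where $-b + \frac12(j-1) \notin \{0, 1, 2, \ldots\}$, $j = 1, \ldots, m$. The confluent hypergeometric function of matrix argument is defined, for $X \in \mathcal{S}^{m \times m}$, as
$${}_1F_1(a; b; X) = \sum_{k=0}^{\infty} \frac{1}{k!} \sum_{|\kappa| = k} \frac{(a)_\kappa}{(b)_\kappa}\, Z_\kappa(X). \tag{2.16}$$
We will make repeated use of Kummer’s formula [35, Eq. (2.8)], [53, p. 265], [56, §35.8]:
$${}_1F_1(a; b; X) = \operatorname{etr}(X)\; {}_1F_1(b - a; b; -X). \tag{2.17}$$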
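There is a Laplace transform relationship between the Bessel function and the confluent hypergeometric function [35, p. 489, Eq. (2.11)]: For $\operatorname{Re}(a) > \frac12(m-1)$, symmetric $Y$, and $Z > 0$,
$$\int_{X > 0} \operatorname{etr}(-XZ)\, (\det X)^{a - \frac12(m+1)}\, J_\nu(XY)\, dX = \frac{\Gamma_m(a)}{\Gamma_m(\nu + \frac12(m+1))}\, (\det Z)^{-a}\; {}_1F_1\big(a; \nu + \tfrac12(m+1); -Y Z^{-1}\big). \tag{2.18}$$
This result can also be proved by expressing $J_\nu(XY)$ as a series of zonal polynomials and then applying (2.9) to integrate term-by-term.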
Given partitions and , we denote by the generalized binomial coefficient [53, pp. 267-269], [56, Eq. (35.6.3)]. For and , the (generalized) Laguerre polynomial , corresponding to , is defined as
[TABLE]
Setting in (2.19), we obtain
[TABLE]
The normalized (generalized) Laguerre polynomial corresponding to is defined by
[TABLE]
. By [53, Theorem 7.6.5], the polynomials are orthonormal with respect to the Wishart distribution :
[TABLE]
By [53, p. 282], for and , there holds the Laplace transform,
[TABLE]
Further, by [53, Theorem 7.6.4, p. 284], for and ,
[TABLE]
Lemma 2.2.
Let and , then
[TABLE]
Also, for , ,
[TABLE]
Proof.
[TABLE]
Applying (2.9) to evaluate the latter integral, we obtain (2.25).
To establish the second identity of the lemma, we substitute into (2.23), obtaining
[TABLE]
Differentiating both sides of the latter equation with respect to and simplifying the outcome, we obtain the stated result. ∎
2.3 Hankel transforms of matrix argument
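Throughout the rest of the paper, if $X$ is a random entity, we denote expectation with respect to the distribution of $X$ by $E_X$ or simply by $E$ whenever the context is clear.
Let $X > 0$ be a random matrix with probability density function $f(X)$. For $\operatorname{Re}(\nu) > \frac12(m-2)$, we define the Hankel transform of order $\nu$ of $X$ as the function
$$\mathcal{H}_{X,\nu}(T) = E_X\big[\Gamma_m(\nu + \tfrac12(m+1))\, J_\nu(TX)\big] = \Gamma_m(\nu + \tfrac12(m+1)) \int_{X > 0} J_\nu(TX)\, f(X)\, dX, \tag{2.27}$$
$T > 0$. The Hankel transform satisfies the following properties: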
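Lemma 2.3.
For $\operatorname{Re}(\nu) > \frac12(m-2)$, $|\mathcal{H}_{X,\nu}(T)| \le 1$ for all $T > 0$, and $\mathcal{H}_{X,\nu}(T)$ is a continuous function of $T$.
Proof.
By (2.13),
$$\big|\Gamma_m(\nu + \tfrac12(m+1))\, J_\nu(TX)\big| \le 1$$
for all $T, X > 0$. Therefore, by the triangle inequality, $|\mathcal{H}_{X,\nu}(T)| \le E_X\big|\Gamma_m(\nu + \tfrac12(m+1))\, J_\nu(TX)\big| \le 1$.
Since $\Gamma_m(\nu + \tfrac12(m+1))\, J_\nu(TX)$ is bounded and continuous in $T$ for every fixed $X > 0$, the continuity of $\mathcal{H}_{X,\nu}(T)$ follows by Dominated Convergence. ∎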
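Example 2.4.
Let $X \sim W_m(\alpha, \Sigma)$, $\alpha > \frac12(m-1)$, $\Sigma > 0$. For $T > 0$, it follows from the definition (2.27) of the Hankel transform that
$$\mathcal{H}_{X,\nu}(T) = \frac{\Gamma_m(\nu + \frac12(m+1))}{\Gamma_m(\alpha)}\, (\det \Sigma)^{-\alpha} \int_{X > 0} J_\nu(TX)\, (\det X)^{\alpha - \frac12(m+1)} \operatorname{etr}(-\Sigma^{-1} X)\, dX.$$
Applying (2.18) to calculate this integral, we obtain
$$\mathcal{H}_{X,\nu}(T) = {}_1F_1\big(\alpha; \nu + \tfrac12(m+1); -T\Sigma\big). \tag{2.28}$$
For the case in which $\alpha = \nu + \frac12(m+1)$, (2.28) reduces to
$$\mathcal{H}_{X,\nu}(T) = \operatorname{etr}(-T\Sigma).$$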
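The special case at the end of Example 2.4 can be checked numerically when $m = 1$. The sketch below assumes that a scalar $X \sim W_1(\alpha, \sigma)$ is a gamma variable with shape $\alpha$ and scale $\sigma$, that the Hankel transform is $E[\Gamma(\nu+1)\, J_\nu(tX)]$ with $J_\nu(x) = \sum_k (-1)^k x^k/(k!\,\Gamma(\nu+1+k))$, and that for $\alpha = \nu + 1$ the transform equals $e^{-t\sigma}$. All function names, the truncation point, and the Simpson grid are illustrative choices.

```python
import math


def bessel_scalar(nu, x, terms=200):
    """Truncated series sum_k (-1)^k x^k / (k! Gamma(nu + 1 + k))."""
    term = 1.0 / math.gamma(nu + 1.0)
    total = term
    for k in range(terms - 1):
        term *= -x / ((k + 1) * (nu + 1.0 + k))
        total += term
    return total


def gamma_pdf(x, alpha, sigma):
    """Density of a gamma variable with shape alpha and scale sigma;
    returns 0 at x <= 0, which is adequate for alpha > 1."""
    if x <= 0.0:
        return 0.0
    log_pdf = ((alpha - 1.0) * math.log(x) - x / sigma
               - math.lgamma(alpha) - alpha * math.log(sigma))
    return math.exp(log_pdf)


def hankel_transform_gamma(t, alpha, sigma, nu, upper=40.0, steps=4000):
    """Composite Simpson approximation of E[Gamma(nu + 1) J_nu(t X)]
    over [0, upper]; `steps` must be even."""
    h = upper / steps
    scale = math.gamma(nu + 1.0)

    def integrand(x):
        return scale * bessel_scalar(nu, t * x) * gamma_pdf(x, alpha, sigma)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0
```

With $\alpha = 2$, $\nu = 1$, $\sigma = 0.5$, and $t = 1.5$, the quadrature should agree with $e^{-0.75}$ up to the discretization error of the grid.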
Example 2.5.
Let and be a random matrix that is independent of . For ,
[TABLE]
To prove this result, we again apply (2.27) and the independence of and , obtaining
[TABLE]
Since , we have
[TABLE]
Applying Example 2.4, we obtain
[TABLE]
Combining (2.30) and (2.31), we obtain the formula stated at the start of this example.
In particular, if then, by Kummer’s formula (2.17), we obtain
[TABLE]
the Laplace transform of .
Throughout the remainder of the paper, if $X$ and $Y$ are random entities we write $X \overset{d}{=} Y$ whenever $X$ and $Y$ have the same distribution. If $X_n$, $n = 1, 2, \ldots$, is a sequence of random entities, we write $X_n \xrightarrow{d} X$ whenever $X_n$ converges in distribution to $X$.
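Theorem 2.6 (Uniqueness of the Hankel transform).
Let $X$ and $Y$ be $m \times m$ positive definite random matrices with Hankel transforms $\mathcal{H}_{X,\nu}$ and $\mathcal{H}_{Y,\nu}$, respectively. Then $\mathcal{H}_{X,\nu} = \mathcal{H}_{Y,\nu}$ if and only if $X \overset{d}{=} Y$.
Proof.
If $X \overset{d}{=} Y$ then it is clear that $\mathcal{H}_{X,\nu} = \mathcal{H}_{Y,\nu}$.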
Conversely, suppose that , independently of and . Let and be the Laplace transforms of and respectively; then, for all ,
[TABLE]
and therefore
[TABLE]
By Example 2.5,
[TABLE]
and
[TABLE]
for all . Combining (2.32), (2.33) and (2.34), we obtain , for all . By the uniqueness theorem for multivariate Laplace transforms [23, p. 16, Theorem 2.1.9] we conclude that . ∎
We denote by the space of functions such that
[TABLE]
The following inversion theorem is obtained by applying the Hankel inversion theory of Herz [35, Section 3]. We refer to Hadjicosta [32] for the full details.
Theorem 2.7 (Inversion of the Hankel transform).
Let $X$ be a random matrix with Hankel transform $\mathcal{H}_{X,\nu}$, and with a probability density function $f$. Then,
[TABLE]
Theorem 2.8 (Hankel Continuity).
Let $\{X_n, n \in \mathbb{N}\}$ be a sequence of positive-definite random matrices with corresponding Hankel transforms $\{\mathcal{H}_n, n \in \mathbb{N}\}$. If there exists a positive semi-definite random matrix $X$ with Hankel transform $\mathcal{H}$ such that $X_n \xrightarrow{d} X$ then, for each $T > 0$,
[TABLE]
Conversely, suppose there exists a function such that as , is continuous at [math], and (2.35) holds. Then is the Hankel transform of an positive semi-definite random matrix , and .
Proof.
Suppose that then, by the Continuous Mapping Theorem for random vectors [61, p. 336], as , for all . By (2.13), is uniformly bounded for all and ; thus, by the Dominated Convergence Theorem, as , for all , and therefore (2.35) holds.
Conversely, suppose that where is independent of the sequence . Also, let be the Laplace transform of . By Example 2.5, we have
[TABLE]
for all . Further, by Lemma 2.3, for all . Thus, by the Dominated Convergence Theorem, as ,
[TABLE]
for all . Since is continuous at [math] and then also is continuous at [math] and . By the continuity theorem for multivariate Laplace transforms [41, p. 63, Theorem 4.3], there is a positive semi-definite random matrix whose Laplace transform is , and . ∎
The next result constitutes a characterization of the Wishart distributions using the Hankel transform $\mathcal{H}_{X,\nu}$, where $\operatorname{Re}(\nu) > \frac12(m-2)$. The result enables the extension, to the Wishart case, of some results of Baringhaus and Taherizadeh [8] on a supremum norm test statistic.
Theorem 2.9.
Let be an positive-definite random matrix with an orthogonally invariant distribution and Hankel transform . If there exist and such that for all satisfying ,
[TABLE]
then .
We refer the reader to Hadjicosta [32], where three proofs of this result are given. We provide here the third and briefest proof, which uses the principle of analytic continuation.
Proof.
The Hankel transform, , of is holomorphic (analytic) in . Also, the hypergeometric function is holomorphic in . Since these two functions agree on the open neighborhood then, by analytic continuation, they agree wherever they both are well-defined. Since they both are well-defined everywhere then we conclude that for all . By Example 2.4 and Theorem 2.6, the uniqueness theorem for Hankel transforms, it follows that . ∎
2.4 Orthogonally invariant Hankel transforms of matrix argument
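For $\nu \in \mathbb{C}$ such that $-\nu + \frac12(j - m) \notin \mathbb{N}$, for all $j = 1, \ldots, m$, and $X, Y \in \mathcal{S}^{m \times m}$, the Bessel function (of the first kind) of order $\nu$ with two matrix arguments is defined as the infinite series
$$J_\nu(X, Y) = \frac{1}{\Gamma_m(\nu + \frac12(m+1))} \sum_{k=0}^{\infty} \frac{(-1)^k}{k!} \sum_{|\kappa| = k} \frac{Z_\kappa(X)\, Z_\kappa(Y)}{(\nu + \frac12(m+1))_\kappa\, Z_\kappa(I_m)}. \tag{2.36}$$
It is straightforward from (2.5) and (2.11) to see that
$$J_\nu(X, Y) = \int_{O(m)} J_\nu(HXH'Y)\, dH \tag{2.37}$$
[53, p. 260]. Also, by applying the inequality (2.13) to the integrand in (2.37) for $X, Y > 0$, we obtain
$$|J_\nu(X, Y)| \le \frac{1}{\Gamma_m(\nu + \frac12(m+1))}. \tag{2.38}$$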
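Definition 2.10.
Let $X$ be an $m \times m$ positive-definite random matrix with p.d.f. $f(X)$. For $\operatorname{Re}(\nu) > \frac12(m-2)$ and $T > 0$, we define the orthogonally invariant Hankel transform of order $\nu$ of $X$ as the function
$$\widetilde{\mathcal{H}}_{X,\nu}(T) = E_X\big[\Gamma_m(\nu + \tfrac12(m+1))\, J_\nu(T, X)\big]. \tag{2.39}$$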
Remark 2.11.
By (2.37) and the definition (2.27) of , we have
[TABLE]
Further, since $|\Gamma_m(\nu + \tfrac12(m+1))\, J_\nu(T, X)| \le 1$ by (2.38), then $\widetilde{\mathcal{H}}_{X,\nu}$ also satisfies the properties in Lemma 2.3.
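Let $a \in \mathbb{C}$, $b \in \mathbb{C}$, where $-b + \frac12(j-1) \notin \{0, 1, 2, \ldots\}$, for all $j = 1, \ldots, m$. The confluent hypergeometric function of two matrix arguments is defined, for $X, Y \in \mathcal{S}^{m \times m}$, as the infinite series,
$${}_1F_1(a; b; X, Y) = \sum_{k=0}^{\infty} \frac{1}{k!} \sum_{|\kappa| = k} \frac{(a)_\kappa}{(b)_\kappa}\, \frac{Z_\kappa(X)\, Z_\kappa(Y)}{Z_\kappa(I_m)}. \tag{2.41}$$
It is clear from the definition that ${}_1F_1(a; b; X, I_m) = {}_1F_1(a; b; X)$. Similar to (2.37), it follows from (2.5) that for $X, Y \in \mathcal{S}^{m \times m}$,
$${}_1F_1(a; b; X, Y) = \int_{O(m)} {}_1F_1(a; b; HXH'Y)\, dH. \tag{2.42}$$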
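Example 2.12.
Let $X \sim W_m(\alpha, \Sigma)$ where $\alpha > \frac12(m-1)$ and $\Sigma > 0$. For $T > 0$, it follows from Example 2.4, (2.40), and (2.42) that
$$\widetilde{\mathcal{H}}_{X,\nu}(T) = {}_1F_1\big(\alpha; \nu + \tfrac12(m+1); -T, \Sigma\big). \tag{2.43}$$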
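Theorem 2.13 (Uniqueness of orthogonally invariant Hankel transforms).
Let $X$ and $Y$ be $m \times m$ positive-definite random matrices with orthogonally invariant distributions and orthogonally invariant Hankel transforms $\widetilde{\mathcal{H}}_{X,\nu}$ and $\widetilde{\mathcal{H}}_{Y,\nu}$, respectively. Then $\widetilde{\mathcal{H}}_{X,\nu} = \widetilde{\mathcal{H}}_{Y,\nu}$ if and only if $X \overset{d}{=} Y$.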
Proof.
By Eq. (2.37) and the definition of the orthogonally invariant Hankel transform (2.39), we have
[TABLE]
Since the distribution of is orthogonally invariant, for all ; therefore, for all ,
[TABLE]
and similarly for . By applying Theorem 2.6, the Uniqueness Theorem for Hankel transforms, we deduce the desired result. ∎
3 Goodness-of-Fit Tests for the Wishart Distributions
3.1 The test statistic
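Let $X_1, \ldots, X_n$ be independent, identically distributed (i.i.d.), positive-definite random matrices, each with probability density function $f$ and positive-definite mean $\mu$. We assume also that the density function of $X_1$ is of the form
$$f(X) = (\det \mu)^{-\frac12(m+1)}\, f_0\big(\mu^{-1/2} X \mu^{-1/2}\big), \tag{3.1}$$
where $f_0$ is orthogonally invariant.
Lemma 3.1.
Under the assumption (3.1), the distribution of $\mu^{-1/2} X_1 \mu^{-1/2}$ is orthogonally invariant.
Proof.
Let $U = \mu^{-1/2} X_1 \mu^{-1/2}$; then $X_1 = \mu^{1/2} U \mu^{1/2}$ and the Jacobian of the transformation from $X_1$ to $U$ is $(\det \mu)^{\frac12(m+1)}$ [53, p. 58]. Therefore, the p.d.f. of $U$ is
$$(\det \mu)^{\frac12(m+1)}\, f\big(\mu^{1/2} U \mu^{1/2}\big) = f_0(U).$$
Since $f_0$ is orthogonally invariant then it follows that the distribution of $U$ is orthogonally invariant. ∎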
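Denote by $P$ the distribution of $X_1$. On the basis of the random sample $X_1, \ldots, X_n$, we wish to test the null hypothesis, $H_0: P \in \{W_m(\alpha, \Sigma) : \Sigma > 0\}$, against the alternative, $H_1: P \notin \{W_m(\alpha, \Sigma) : \Sigma > 0\}$, where $\alpha$ is known.
Since $\Sigma$ is unspecified by $H_0$, the data $X_1, \ldots, X_n$ cannot be used directly to construct a test statistic. Thus, with $\bar{X}_n = n^{-1} \sum_{j=1}^n X_j$ denoting the sample mean, define $Y_j = \bar{X}_n^{-1/2} X_j \bar{X}_n^{-1/2}$, for $j = 1, \ldots, n$. Under $H_0$, the distribution of $Y_1, \ldots, Y_n$ does not depend on $\Sigma$, so a test statistic can be based on them. Let $P_0$ denote the probability measure corresponding to the $W_m(\alpha, I_m)$ distribution. For $\operatorname{Re}(\nu) > \frac12(m-2)$, define the empirical orthogonally invariant Hankel transform of order $\nu$ of $Y_1, \ldots, Y_n$ as
$$\widetilde{\mathcal{H}}_{n,\nu}(T) = \frac{1}{n} \sum_{j=1}^{n} \Gamma_m(\nu + \tfrac12(m+1))\, J_\nu(T, Y_j), \tag{3.2}$$
$T > 0$. Further, define the test statistic
$$T_n^2 = n \int_{T > 0} \Big[\widetilde{\mathcal{H}}_{n,\nu}(T) - {}_1F_1\big(\alpha; \nu + \tfrac12(m+1); -T/\alpha\big)\Big]^2\, dP_0(T). \tag{3.3}$$
To provide motivation for this test statistic, suppose that $H_0$ is valid; then $E(X_1) = \alpha\Sigma$ and, for large $n$, we can expect that $\bar{X}_n \simeq \alpha\Sigma$, almost surely. By the Continuous Mapping Theorem, the sequence of random variables $Y_j$ should approximate the i.i.d. sequence $\alpha^{-1} \Sigma^{-1/2} X_j \Sigma^{-1/2} \sim W_m(\alpha, \alpha^{-1} I_m)$, $j = 1, \ldots, n$, for each $j$ and for sufficiently large $n$. Applying to (3.2) the Strong Law of Large Numbers, we can expect that, for large $n$, $\widetilde{\mathcal{H}}_{n,\nu}(T) \simeq \widetilde{\mathcal{H}}_{\alpha^{-1} \Sigma^{-1/2} X_1 \Sigma^{-1/2},\, \nu}(T)$, almost surely.
By Example 2.12, we deduce that
$$\widetilde{\mathcal{H}}_{\alpha^{-1} \Sigma^{-1/2} X_1 \Sigma^{-1/2},\, \nu}(T) = {}_1F_1\big(\alpha; \nu + \tfrac12(m+1); -T/\alpha\big)$$
for $T > 0$. Therefore, by Lemma 3.1 and Theorem 2.13, small values of $T_n^2$ provide strong evidence in support of $H_0$, and we will reject $H_0$ for large values of $T_n^2$.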
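For the remainder of the paper, we set
$$\nu = \alpha - \tfrac12(m+1).$$
Since $\alpha > \frac12(m-1)$ then $\nu > -1$. We also denote $\widetilde{\mathcal{H}}_{X,\nu}$ and $\widetilde{\mathcal{H}}_{n,\nu}$ by $\widetilde{\mathcal{H}}_X$ and $\widetilde{\mathcal{H}}_n$, respectively. By Kummer’s formula (2.17), the statistic (3.3) becomes
$$T_n^2 = n \int_{T > 0} \Big[\widetilde{\mathcal{H}}_n(T) - \operatorname{etr}(-T/\alpha)\Big]^2\, dP_0(T). \tag{3.4}$$
This integral represents $T_n^2$ as a weighted integral of the squared difference between the empirical orthogonally invariant Hankel transform and its almost sure limit under the null hypothesis.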
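For intuition, the weighted-integral form of the statistic can be evaluated by direct quadrature in the scalar case $m = 1$. The sketch below rests on several assumptions that should be read as ours: the data are standardized as $Y_j = X_j / \bar{X}_n$, the order is $\nu = \alpha - 1$, the empirical transform is $n^{-1} \sum_j \Gamma(\alpha)\, J_\nu(t Y_j)$ with $J_\nu(x) = \sum_k (-1)^k x^k / (k!\,\Gamma(\nu + 1 + k))$, the null limit is $e^{-t/\alpha}$, and $P_0$ is the gamma distribution with shape $\alpha$ and unit scale. The function names and the Simpson grid are illustrative, and the closed form of Proposition 3.2 would be preferable in practice.

```python
import math


def bessel_scalar(nu, x, terms=200):
    """Truncated series sum_k (-1)^k x^k / (k! Gamma(nu + 1 + k))."""
    term = 1.0 / math.gamma(nu + 1.0)
    total = term
    for k in range(terms - 1):
        term *= -x / ((k + 1) * (nu + 1.0 + k))
        total += term
    return total


def gof_statistic_scalar(sample, alpha, upper=30.0, steps=600):
    """Simpson approximation of n * int (H_n(t) - exp(-t / alpha))^2 dP_0(t),
    the assumed m = 1 form of the statistic; requires alpha >= 1 and even steps."""
    if alpha < 1.0:
        raise ValueError("this sketch assumes alpha >= 1")
    n = len(sample)
    mean = sum(sample) / n
    y = [x / mean for x in sample]  # scale-free standardized data
    nu = alpha - 1.0
    norm = math.gamma(alpha)
    h = upper / steps

    def integrand(t):
        if t == 0.0:
            return 0.0  # H_n(0) = 1 = exp(0), so the squared difference vanishes
        weight = math.exp((alpha - 1.0) * math.log(t) - t - math.lgamma(alpha))
        empirical = sum(norm * bessel_scalar(nu, t * y_j) for y_j in y) / n
        return (empirical - math.exp(-t / alpha)) ** 2 * weight

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return n * total * h / 3.0
```

Because the statistic depends on the data only through $X_j / \bar{X}_n$, rescaling the sample leaves its value unchanged, which mirrors the role of the unknown scale matrix in the matrix case.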
We now evaluate the test statistic for a given random sample.
Proposition 3.2.
The test statistic (3.4) is a $V$-statistic of order 2. Specifically,
[TABLE]
where, for ,
[TABLE]
Proof.
After squaring the integrand in (3.4), we see that there are three terms to be computed. First,
[TABLE]
By (2.37) and Fubini’s theorem,
[TABLE]
Writing , , and applying Herz’s generalization (2.15) of Weber’s second exponential integral, we find that (3.5) equals
[TABLE]
On the right-hand side of (3.6), we replace by and apply the group invariance of the Haar measure and its normalization; then we find that (3.6) reduces to
[TABLE]
Therefore,
[TABLE]
The second term to be calculated is
[TABLE]
Similar to the previous calculation, we use (2.37) to express as an average over and apply Fubini’s theorem to reverse the order of integration. The resulting integral is a special case of (2.14), so we conclude that
[TABLE]
The third and last integral, which we evaluate using the gamma integral (2.10) is
[TABLE]
Collecting together the three terms, we obtain the desired result. ∎
3.2 The limiting null distribution of the test statistic
We denote by the space of (equivalence classes of) orthogonally invariant Borel measurable functions that are square-integrable with respect to the probability measure , i.e., for which . The space is a separable Hilbert space when equipped with the inner product
[TABLE]
and the corresponding norm
[TABLE]
. Moreover, the set of normalized Laguerre polynomials , with ranging over all partitions, defined in Section 2.2, forms an orthonormal basis for the space ; see Herz [35, p. 502, Theorem 4.6] and Constantine [17, Section 3].
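We now define the stochastic process
$$\mathcal{Z}_n(T) = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} \Big[\Gamma_m(\alpha)\, J_\nu(T, Y_j) - \operatorname{etr}(-T/\alpha)\Big], \tag{3.7}$$
$T > 0$. We view the random field $\mathcal{Z}_n$ as a random element in $L^2$ since, as we now show, its sample paths are in $L^2$.
Lemma 3.3.
The test statistic (3.4) can be written as
$$T_n^2 = \|\mathcal{Z}_n\|_{L^2}^2.$$
In particular, $\mathcal{Z}_n \in L^2$.
This result follows immediately from (3.2), (3.4), and (3.7).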
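Remark 3.4.
By [29, Example 1.4], $(Y_1, \ldots, Y_n)$ has a matrix Liouville distribution, of the second kind, that does not depend on $\Sigma$. Therefore, without loss of generality, we will set $\Sigma = I_m$ in deriving the limiting null distribution of $T_n^2$.
We also note that, for each $j$, the matrices $\bar{X}_n^{-1/2} X_j \bar{X}_n^{-1/2}$ and $\bar{X}_n^{-1} X_j$ have the same spectrum; this result is proved by verifying that $\bar{X}_n^{-1/2} X_j \bar{X}_n^{-1/2}$ and $\bar{X}_n^{-1} X_j$ have the same characteristic polynomial. Consequently,
$$J_\nu(T, Y_j) = J_\nu(T, \bar{X}_n^{-1} X_j),$$
$T > 0$, so we can replace $Y_j$ by $\bar{X}_n^{-1} X_j$ in the definition (3.2) of the test statistic.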
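The equal-spectrum claim in Remark 3.4 is easy to verify numerically for $2 \times 2$ matrices, where the characteristic polynomial $\lambda^2 - (\operatorname{tr} A)\lambda + \det A$ is determined by the trace and the determinant. The sketch below uses the closed-form square root $A^{1/2} = (A + \sqrt{\det A}\, I_2)/\sqrt{\operatorname{tr} A + 2\sqrt{\det A}}$, valid for $2 \times 2$ positive-definite $A$. The helper names and the example matrices are illustrative.

```python
import math


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inv2(a):
    """Inverse of a nonsingular 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def sqrt_spd2(a):
    """Positive-definite square root of a 2x2 positive-definite matrix."""
    s = math.sqrt(a[0][0] * a[1][1] - a[0][1] * a[1][0])
    t = math.sqrt(a[0][0] + a[1][1] + 2.0 * s)
    return [[(a[0][0] + s) / t, a[0][1] / t],
            [a[1][0] / t, (a[1][1] + s) / t]]


def char_poly(a):
    """Return (trace, det), the coefficients of lambda^2 - trace*lambda + det."""
    return (a[0][0] + a[1][1], a[0][0] * a[1][1] - a[0][1] * a[1][0])


def standardized_pair(xbar, x):
    """Return (xbar^{-1/2} x xbar^{-1/2}, xbar^{-1} x)."""
    root_inv = inv2(sqrt_spd2(xbar))
    symmetric = matmul2(matmul2(root_inv, x), root_inv)
    nonsymmetric = matmul2(inv2(xbar), x)
    return symmetric, nonsymmetric
```

With $\bar{X} = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}$ and $X = \begin{pmatrix} 4 & 1 \\ 1 & 2 \end{pmatrix}$, both matrices have trace $14/5$ and determinant $7/5$, hence the same eigenvalues, even though only the first is symmetric.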
We now state the main result of this section.
Theorem 3.5.
Let and be i.i.d. -distributed random matrices, where , and let be the random field defined in (3.7). Then, there exists a centered Gaussian field , with sample paths in and with covariance function,
[TABLE]
, such that in as . Moreover,
[TABLE]
The remainder of this section is devoted to proving Theorem 3.5, so readers who wish to postpone reading the detailed derivation may continue directly to Section 3.3.
3.2.1 Preliminary details
Here, we provide details on the Frobenius norm of a matrix, the Taylor expansion of functions on the space of symmetric matrices, and various preliminary lemmata necessary for the derivation of the asymptotic distribution of $T_n^2$.
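For $X, Y \in \mathbb{R}^{m \times m}$, the inner product between $X$ and $Y$ is defined by $\langle X, Y \rangle = \operatorname{tr}(X'Y)$ and the Frobenius norm of $X$ is defined by $\|X\|_F = \langle X, X \rangle^{1/2} = [\operatorname{tr}(X'X)]^{1/2}$. By [37, Section 5.6, p. 291], the Frobenius norm satisfies the triangle inequality, $\|X + Y\|_F \le \|X\|_F + \|Y\|_F$, and moreover, it is sub-multiplicative, $\|XY\|_F \le \|X\|_F\, \|Y\|_F$.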
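The two norm properties just quoted can be illustrated with a short, dependency-free sketch, assuming the usual definitions $\langle X, Y \rangle = \operatorname{tr}(X'Y)$ and $\|X\|_F = \langle X, X \rangle^{1/2}$. The helper names are ours, and matrices are represented as nested lists.

```python
import math


def frobenius_inner(x, y):
    """tr(X'Y), computed as the sum of entrywise products."""
    return sum(x_ij * y_ij for x_row, y_row in zip(x, y)
               for x_ij, y_ij in zip(x_row, y_row))


def frobenius_norm(x):
    """Frobenius norm, the square root of tr(X'X)."""
    return math.sqrt(frobenius_inner(x, x))


def mat_add(x, y):
    """Entrywise sum of two matrices of equal shape."""
    return [[a + b for a, b in zip(x_row, y_row)] for x_row, y_row in zip(x, y)]


def mat_mul(x, y):
    """Matrix product of x (p x q) and y (q x r)."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]
```

For $X = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $Y = \begin{pmatrix} 0 & 1 \\ -1 & 2 \end{pmatrix}$ we have $\|X\|_F = \sqrt{30}$, $\|Y\|_F = \sqrt{6}$, $\|XY\|_F = \sqrt{166} \le \sqrt{180}$, and $\|X + Y\|_F = \sqrt{50} \le \sqrt{30} + \sqrt{6}$.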
We use the usual notation for Kronecker’s delta, viz., $\delta_{ij} = 1$ or $0$ for $i = j$ or $i \neq j$, respectively. For a symmetric matrix $Y = (y_{ij})$, the gradient operator $\nabla_Y$ is the $m \times m$ matrix
[TABLE]
For example, it is straightforward to see that
Let be a function; that is, is differentiable of order and its partial derivatives are continuous on . The Taylor expansion of order of the function , at , is
[TABLE]
where , for some .
Lemma 3.6.
For ,
[TABLE]
where and .
Proof.
By (2.37),
[TABLE]
It is straightforward to verify that the conditions given by Burkill and Burkill [14, p. 289, Theorem 8.72] for interchanging derivatives and integrals are satisfied; therefore,
[TABLE]
Setting and , we have . By Maass [51, p. 64], ; therefore,
[TABLE]
since is scalar-valued. Combining (3.12) and (3.2.1), we obtain (3.11). ∎
We note that all further interchanges of derivatives and integrals are justifiable by appeal to [14, loc. cit.], so we will perform such interchanges without further citation. Also, various positive constants arise in the following calculations, and we will denote them generically by , .
Lemma 3.7.
Let be an matrix such that . Also, let be an positive-definite matrix. Then, there exists a constant such that
[TABLE]
Proof. Since the trace is a linear operator, we have
[TABLE]
where $\nabla_Y \otimes Y^{1/2}$ is the Kronecker product of the gradient $\nabla_Y$ acting on the matrix $Y^{1/2}$, and $V_{ij} := \big(\nabla_Y \otimes Y^{1/2}\big)_{ij}$ is the $(i,j)$th block matrix in that Kronecker product.
By the Cauchy-Schwarz inequality, and the fact that implies , we obtain
[TABLE]
Recall from [12, p. 13] the multi-linear operator norm, , which we define here in the following context: If denotes the th element of a matrix and denotes the th element of V_{ij}:=\big{(}\nabla_{Y}\otimes Y^{1/2}\big{)}_{ij}, the th block in the tensor product , then
[TABLE]
and we define
[TABLE]
Since all norms on a finite-dimensional space are equivalent, there exists a constant such that . By [21, p. 262, Eq. (6)], there holds the crucial inequality,
[TABLE]
Hence,
[TABLE]
so we obtain
[TABLE]
Combining the Cauchy-Schwarz bound above with (3.16), we obtain (3.14). ∎
Lemma 3.8.
For , there exists a constant such that
[TABLE]
Proof.
By Eq. (3.11),
[TABLE]
where and . By Minkowski’s inequality for integrals,
[TABLE]
since the Frobenius norm is sub-multiplicative.
By Herz’s generalization, (2.12), of the Poisson integral,
[TABLE]
where . Therefore,
[TABLE]
Applying Minkowski’s inequality and then using (3.14) to bound the integrand, we obtain
[TABLE]
Combining the two preceding bounds, we obtain
[TABLE]
For , and
[TABLE]
Hence,
[TABLE]
which completes the proof. ∎
Lemma 3.9.
For , there exist constants such that
[TABLE]
Proof.
By (3.11),
[TABLE]
where , , and . Applying (2.12) and interchanging derivatives and integrals, we obtain
[TABLE]
where . Therefore,
[TABLE]
Let and , ; then we observe that
[TABLE]
since . Also, using the identity
[TABLE]
we find that
[TABLE]
By applying the same argument as in Lemma 3.7, we obtain
[TABLE]
so, by the Cauchy-Schwarz inequality and the fact that implies , we obtain
[TABLE]
Since the norms and are equivalent, there exists such that
[TABLE]
By a result of Del Moral and Niclas [21, Theorem 1.1, Eq. (4)],
[TABLE]
where is the matrix exponential function. Therefore,
[TABLE]
For any matrices and , and for any such that ,
[TABLE]
Now setting , , we obtain
[TABLE]
Therefore,
[TABLE]
For any positive-definite matrix and for ,
[TABLE]
hence, for , and ,
[TABLE]
Therefore, for , the right-hand side of (3.25) is bounded above by
[TABLE]
Define , , and , . Notice that
[TABLE]
and
[TABLE]
with . Then satisfies the inhomogeneous differential equation
[TABLE]
with boundary condition . By following the approach of Kågström [42, Section 4], we find that the solution of this differential equation is
[TABLE]
By Minkowski’s inequality and the sub-multiplicative property of the Frobenius norm,
[TABLE]
Using (3.26) to bound both exponential terms in this integrand, we find that
[TABLE]
Assuming that , we calculate the latter integral, obtaining
[TABLE]
Combining (3.23)-(3.27), we obtain
[TABLE]
By continuity, this result remains valid for .
Next, it follows from (3.2.1) that
[TABLE]
By the Cauchy-Schwarz inequality,
[TABLE]
and by (3.14),
[TABLE]
Therefore, with , we have derived
[TABLE]
By (3.21), Minkowski’s inequality, and the sub-multiplicative property of the Frobenius norm, we obtain
[TABLE]
Applying the bound (3.28), we find that
[TABLE]
By a result of Wihler [70, Eq. (3.2)],
[TABLE]
Since , , and , then we have
[TABLE]
Also, for ,
[TABLE]
Combining (3.29)-(3.32), and using the fact that is normalized, we obtain
[TABLE]
which is identical with (3.20). ∎
Let be a Wishart-distributed random matrix, , and define for positive definite matrices the matrix-valued function
[TABLE]
Lemma 3.10.
For ,
[TABLE]
Proof. We will establish this result by the method of Laplace transforms. For , the Laplace transform of the function is
[TABLE]
We substitute (3.33) into this integral, interchange the trace and expectation, apply Fubini’s theorem to interchange the expectation and the integral, and verify the validity of interchanging derivatives and integrals; then we obtain
[TABLE]
Applying (2.37) to write as an average of its single-matrix argument counterpart, and reversing the order of integration, we obtain
[TABLE]
The inner integral with respect to is precisely the Laplace transform (2.14); substituting the outcome of that calculation into (3.37), we obtain
[TABLE]
Interchanging the gradient and the integral, and then the integral and the trace, noting that
[TABLE]
we find that
[TABLE]
since the trace and the integral commute.
Next, we have
[TABLE]
by interchanging integral and derivative. By [53, p. 279, Eq. (41)],
[TABLE]
differentiating this series term-by-term and evaluating the outcome at , we find that (3.39) equals
[TABLE]
By (2.9), ; therefore, by combining (3.38)-(3.40), we obtain
[TABLE]
It is also known from [53, p. 248] that
[TABLE]
for , where denotes the maximum of the absolute values of the eigenvalues of . Differentiating this series term-by-term with respect to , we obtain
[TABLE]
now setting and comparing the outcome with (3.41), we find that
[TABLE]
Therefore, by (2.10),
[TABLE]
evidently a Laplace transform. Comparing this expression with (3.35), the conclusion follows from the uniqueness theorem for Laplace transforms. ∎
Lemma 3.11.
For ,
[TABLE]
Proof. Define for the function
[TABLE]
By (3.33), , where . Since the distribution of is orthogonally invariant, i.e., for all , then
[TABLE]
By (3.44),
[TABLE]
By Maass [51, p. 64], ; so it follows that
[TABLE]
However, for all ; therefore,
[TABLE]
Substituting this result into (3.2.1) we obtain, for all ,
[TABLE]
Since for all then, by Schur’s Lemma [62, p. 315], is a scalar matrix, i.e., for some scalar . By taking traces and by applying (3.34), we obtain
[TABLE]
therefore,
[TABLE]
The proof is now complete. ∎
The final preliminary result needed for the proof of Theorem 3.5 is the following consequence of [43, Lemma 7, Eq. (20)].
Lemma 3.12.
The integrals
[TABLE]
are finite for all . Further, the integral
[TABLE]
is finite for all .
3.2.2 The proof of the limiting distribution
In what follows, we will use for various matrices the shorthand notation
[TABLE]
Proof of Theorem 3.5. By (3.10), the Taylor expansion of the Bessel function at is
[TABLE]
where , for some . Setting and , , in (3.46), we have the Taylor expansion of order 1 of at :
[TABLE]
where , for some . Define
[TABLE]
then (3.47) reduces to
[TABLE]
Adding and subtracting the term on the right-hand side, we obtain
[TABLE]
where the second equality is obtained by permuting terms cyclically in the inner product. For and , , define the function
[TABLE]
We remark that as are i.i.d. then does not depend on ; hence,
[TABLE]
is a function evaluated earlier; by (3.43),
[TABLE]
Define the random fields , and , , by
[TABLE]
The random fields , arise as follows. To define , we use the first two terms in (3.2.2). To define , we use the same expression from except that the term is replaced by its expected value , which is given by (3.43). To define , we replace the term in by a constant multiple of , the constant being obtained by applying the Law of Large Numbers to . We will show that
[TABLE]
By writing as
[TABLE]
it will follow that in (cf. Billingsley [10, p. 25, Theorem 4.1]).
To establish (3.49), define for ,
[TABLE]
. Since then and therefore, since the trace and the expectation are linear operators, we deduce that
[TABLE]
Also, by Example 2.12 and (2.17), we have Therefore, , for all and , and it is also clear that are independent and identically distributed random elements in .
We now show that for . We have
[TABLE]
By the Cauchy-Schwarz inequality, for ; so to prove that , it suffices to prove that
[TABLE]
[TABLE]
and
[TABLE]
To establish (3.54), we apply (2.38) to obtain
[TABLE]
To prove (3.55), write
[TABLE]
therefore, the integral in (3.55) is a constant multiple of
[TABLE]
Since is a polynomial in , its expectation is finite because the moment-generating function of exists. As for
[TABLE]
again this integral is finite because is a polynomial and , after normalization, is a Wishart measure. For the same reason, (3.56) is valid.
In summary, for and , are i.i.d. random elements in with and . Therefore, by the Central Limit Theorem in ,
[TABLE]
where is a centered Gaussian random element in . Moreover, has the same covariance operator as .
It is well-known that the covariance operator of the random element is uniquely determined by the covariance function of the random field ; cf., Gīkhman and Skorohod [25, pp. 218-219].
We now show that the function in (3.9) is the covariance function of . Noting that for all , we obtain
[TABLE]
By (3.53),
[TABLE]
so the calculation of reduces to evaluating the four terms obtained by expanding the product on the right-hand side of (3.2.2).
The first term in the product in (3.2.2) is
[TABLE]
By (2.15), (2.37), and Fubini’s theorem, we find that this term equals
[TABLE]
Since , and
[TABLE]
we conclude that the first term equals
[TABLE]
The second term in the product in (3.2.2) is
[TABLE]
We have seen earlier that
[TABLE]
Also, by (2.37),
[TABLE]
Since then, by [53, p. 442], the expectation is a multiple of the expected value of a noncentral Wishart distributed random matrix , where is the matrix of noncentrality parameters. Hence,
[TABLE]
Substituting this result into (3.64), we obtain
[TABLE]
Substituting (3.63) and (3.65) into (3.62), and simplifying the result, we find that the second term equals
[TABLE]
The third term in the product in (3.2.2) is
[TABLE]
which is the same as the second term but with and interchanged.
The fourth term in the product in (3.2.2) is
[TABLE]
Using the explicit formula for from (3.34) and (3.43), we obtain
[TABLE]
By (2.4) and (2.9), it follows that
[TABLE]
Also, using (2.3), we obtain
[TABLE]
Substituting (3.67) and (3.2.2) into (3.66), we deduce that the fourth term equals
[TABLE]
Combining all four terms, we obtain (3.9).
To establish (3.50), we begin by showing that
[TABLE]
converges in distribution to a random variable with finite variance. By the multivariate Central Limit Theorem, converges in distribution to a multivariate normal random vector. Also, by the Law of Large Numbers, . Therefore, by Slutsky’s theorem, converges in distribution to a multivariate normal random vector, so it follows from the Continuous Mapping Theorem that converges in distribution to a random variable which has finite variance.
By the Taylor expansion (3.2.2),
[TABLE]
Define
[TABLE]
By the Cauchy-Schwarz inequality,
[TABLE]
so we will establish (3.50) by proving that .
By the triangle inequality and the sub-multiplicative property of the Frobenius norm, we have
[TABLE]
Applying (3.20), we obtain
[TABLE]
Also, since , , then
[TABLE]
Define
[TABLE]
and
[TABLE]
By the Cauchy-Schwarz inequality, . Thus, it suffices to show that .
We first establish that . By the Cauchy-Schwarz inequality,
[TABLE]
By Weyl’s inequality for the smallest eigenvalue of the sum of two symmetric matrices,
[TABLE]
therefore,
[TABLE]
By the Law of Large Numbers and the Continuous Mapping Theorem, we have
[TABLE]
Again by the Law of Large Numbers,
[TABLE]
Therefore, to complete the proof of , we need to establish that
[TABLE]
Since then these criteria are the same, so we show that the first one holds. For , we have and hence . By Lemma 3.12,
[TABLE]
for , so it follows that .
As for , the proof is similar. By the Cauchy-Schwarz inequality,
[TABLE]
Applying the Law of Large Numbers and the Continuous Mapping Theorem, we obtain and
[TABLE]
Thus, to complete the proof of , we need to establish that
[TABLE]
which are identical criteria. Since , it suffices to show that
[TABLE]
However, so so, by Lemma 3.12,
[TABLE]
for all . Therefore, for all .
Since , we conclude that for all . By Slutsky’s theorem, and therefore Hence, by (3.69), , for .
To establish (3.51), define for and . Then it is straightforward to verify that
[TABLE]
and therefore
[TABLE]
By the Law of Large Numbers and the Continuous Mapping Theorem, . Since then , ; also, are i.i.d.
We now show that . First,
[TABLE]
By the triangle inequality,
[TABLE]
Therefore, it suffices to show that and are finite.
Applying the sub-multiplicative property of the Frobenius norm, and the inequality (3.17), we have
[TABLE]
; therefore,
[TABLE]
By Lemma 3.12, for . Since , , then the same holds for . Therefore, it follows that for all .
To show that , , we observe that is a polynomial in and therefore its expectation is finite since the moment-generating function of exists.
Next, we vectorize the matrices and denote the corresponding vectors by . Then, are i.i.d. zero-mean random vectors with finite covariance matrices. By the multivariate Central Limit Theorem, converges in distribution to a multivariate normal random vector. Define
[TABLE]
for ; we regard as a random element in . Since is a continuous function, it follows from the Continuous Mapping Theorem that converges to a random element in and also that
[TABLE]
converges in distribution to a random variable that has finite variance. Since , by (3.70) then, by Slutsky’s theorem, we obtain ; therefore .
To establish (3.52), we observe that
[TABLE]
Substituting the now-familiar explicit formula for from (3.43), we obtain
[TABLE]
and as we have seen before, the latter integral is finite.
Now, we observe that
[TABLE]
By the multivariate Central Limit Theorem, converges in distribution to a multivariate normal random vector; and by the Law of Large Numbers for random vectors, . By Slutsky’s theorem, , and so . Hence, by the Continuous Mapping Theorem,
[TABLE]
and so .
Finally, by the Continuous Mapping Theorem in ([16, p. 67], [10, p. 31]), , i.e.,
[TABLE]
The proof now is complete. ∎
3.3 Eigenvalues and eigenfunctions of the covariance operator
The covariance operator of the random element is defined for and by
[TABLE]
where is the covariance function defined in equation (3.9). Let be the positive eigenvalues, listed in non-increasing order according to their multiplicities, of ; also, let be i.i.d. -distributed random variables. It is well known that the integrated squared process, , has the same distribution as . This result follows from the Karhunen-Loève expansion of the Gaussian random field ; see Le Maître and Knio [48, Chapter 2] or Vakhania [68, p. 58]. Therefore, the limiting null distribution of is the same as . Let us also denote by , , an enumeration, listed in non-increasing order, of the distinct values of the eigenvalues . Further, we denote by the corresponding multiplicities of the distinct eigenvalues . Then, converges in distribution to , where are i.i.d. -distributed random variables.
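As a rough numerical illustration of this limiting law, the following Python sketch simulates a truncated weighted sum of independent chi-squared variables and reads off an upper quantile. It assumes, as in the standard Karhunen-Loève representation, that each summand is chi-squared with one degree of freedom. The function name `weighted_chisq_quantile` and the weights used below are hypothetical placeholders, not eigenvalues computed from the covariance operator.

```python
import numpy as np

def weighted_chisq_quantile(weights, prob=0.95, n_draws=200_000, seed=0):
    """Monte Carlo quantile of sum_k weights[k] * chi2_1 (independent terms)."""
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=float)
    # Each row holds one draw of the independent chi-squared(1) variables.
    chi2 = rng.chisquare(df=1, size=(n_draws, w.size))
    return float(np.quantile(chi2 @ w, prob))

# Hypothetical, geometrically decaying weights standing in for eigenvalues.
example_weights = 0.5 ** np.arange(1, 11)
q95 = weighted_chisq_quantile(example_weights)
```

With a single weight equal to 1 the function should approximately return the 95th percentile of a chi-squared variable with one degree of freedom, which gives a simple sanity check on the simulation.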
For , define
[TABLE]
the first term in the covariance function defined in equation (3.9); by (3.60) and (3.61),
[TABLE]
We will first find the eigenvalues and eigenfunctions of the integral operator , defined for and in by
[TABLE]
Recall that and . Throughout the remainder of this work, we use the notation
[TABLE]
We also set
[TABLE]
for ranging over all partitions, and
[TABLE]
Theorem 3.13.
The collection , where ranges over the set of all partitions, is a complete enumeration of the eigenvalues and eigenfunctions, respectively, of the operator . Further, the eigenfunctions , for ranging over all partitions, form an orthonormal basis in , and is positive and of trace-class.
Proof.
Recall from [53, p. 290, Problem 7.21] the Poisson kernel: For and ,
[TABLE]
In this expansion, set
[TABLE]
so that . Note that satisfies the quadratic equation
[TABLE]
and also that this equation is equivalent to the identity
[TABLE]
In (3.77), also set
[TABLE]
Then,
[TABLE]
Applying (3.75),(3.76) and (3.78)-(3.80) to (3.77), and substituting the result in (3.71), we obtain for , the pointwise convergent series expansion,
[TABLE]
By (2.22), the generalized Laguerre polynomials form an orthonormal system; then it is straightforward to verify that the system also is orthonormal in , for ranging over all partitions, i.e.,
[TABLE]
Now we verify that the series (3.81) converges in the separable tensor product Hilbert space . By the Cauchy criterion, it suffices to prove that for each , there exists such that
[TABLE]
for all such that . By squaring the integrand, it suffices by Fubini’s theorem to consider
[TABLE]
Since the system is orthonormal, the latter sum reduces to
[TABLE]
where represents the number of partitions of into at most parts. It is well-known that
[TABLE]
Therefore, is a convergent series. Since every convergent series in any metric space is Cauchy, it follows that for each , there exists such that , for all such that . Therefore, the series (3.81) is Cauchy in and hence,
[TABLE]
By Fubini’s theorem, the latter expression equals
[TABLE]
It follows from the orthonormality, (3.82), of the system that, for and partition such that ,
[TABLE]
[TABLE]
By the Cauchy-Schwarz inequality, this latter expression is bounded by
[TABLE]
By the orthonormality property (3.82) and the fact that is a probability distribution, the second term in (3.85) equals 1; therefore,
[TABLE]
Since is arbitrary, we now let . By (3.83), the right-hand side of (3.86) converges to 0, so we obtain
[TABLE]
which proves that , for -almost every . Therefore, is an eigenvalue of with corresponding eigenfunction .
Since the kernel is symmetric in , it follows that is symmetric. To show that is positive, we observe that for ,
[TABLE]
Substituting for from (3.72), we obtain
[TABLE]
Applying Fubini’s theorem to reverse the order of the integration, we find that the inner integrals with respect to and are complex conjugates of each other; therefore,
[TABLE]
which is positive. Thus, is positive.
Next, we prove that is of trace-class. For , , it again follows by (3.72) and Fubini’s theorem that
[TABLE]
Denote by the integral operator,
[TABLE]
. By (2.38), and therefore
[TABLE]
for . By [72, p. 93, Theorem 8.8], it follows that is a Hilbert-Schmidt operator. Now, we can write (3.3) as
[TABLE]
, which proves that is of trace-class.
To complete the proof, we now show that the set is complete. It is sufficient to show that if with for all partitions , then -almost everywhere. First, we note that
[TABLE]
by the Cauchy-Schwarz inequality. Since , the second term on the right-hand side of (3.3) is finite. Taking the limit on both sides of (3.3) as and applying (3.83), we obtain
[TABLE]
Since for all partitions then (3.90) reduces to
[TABLE]
Therefore, by (3.87), we obtain for -almost every ,
[TABLE]
Since the function is continuous for all and fixed and by (2.38),
[TABLE]
for , then by the Dominated Convergence Theorem, the integral on the left-hand side of (3.91) is a continuous function of . If two continuous functions are equal -almost everywhere then they are equal everywhere; hence (3.91) holds for all .
Henceforth, without loss of generality, we assume that is real-valued. Let and denote the positive and negative parts of , respectively. Then, , and are nonnegative, and since then by the Cauchy-Schwarz inequality, and are -integrable. Also, by (3.91),
[TABLE]
. By the Uniqueness Theorem for orthogonally invariant Hankel transforms, Theorem 2.13, we notice that there are only two possible cases. Either
[TABLE]
or
[TABLE]
For the first case, we have and so -almost everywhere. As for the second case, we have
[TABLE]
. By the Uniqueness Theorem for orthogonally invariant Hankel transforms, we obtain and hence -almost everywhere. This proves that the orthonormal set is complete, and therefore, it forms a basis in the separable Hilbert space . ∎
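The convergence step in the proof above relies on the number of partitions of an integer into at most a fixed number of parts. As a small aid for checking such counts numerically, here is a minimal Python sketch; the function name `count_partitions` is an assumption made for illustration, and the recurrence used is the standard one (either fewer parts are used, or every part is reduced by one).

```python
from functools import lru_cache

def count_partitions(k, m):
    """Number of partitions of the integer k into at most m parts."""
    @lru_cache(maxsize=None)
    def p(n, parts):
        if n == 0:
            return 1          # the empty partition
        if parts == 0:
            return 0          # positive n cannot be split into zero parts
        # Either use at most parts - 1 parts, or use exactly `parts` parts
        # and subtract 1 from each of them.
        total = p(n, parts - 1)
        if n >= parts:
            total += p(n - parts, parts)
        return total
    return p(k, m)

# Counts for k = 0, ..., 8 with at most 3 parts.
counts = [count_partitions(k, 3) for k in range(9)]
```

For fixed `m` these counts grow only polynomially in `k`, which is the reason a geometrically decaying factor is enough to make the series in the proof converge.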
The proof of the following theorem is similar to the proof of Theorem 3.13, and the complete details are provided by Hadjicosta [32].
Theorem 3.14.
Let be the covariance operator of the random element defined as
[TABLE]
for all and for all functions in , where is the covariance function defined in equation (3.9). Then, is positive and of trace-class.
Recall here that a non-trivial function is an eigenfunction of if there exists an eigenvalue such that . As is self-adjoint and positive, its eigenvalues are real and nonnegative. In the next result, we find the positive eigenvalues (that are not eigenvalues of ) and corresponding eigenfunctions of the operator , and we will show in Subsection 3.4 that 0 is not an eigenvalue of .
Theorem 3.15.
Let with for any partition . Also, denote by , , an enumeration, listed in non-increasing order, of the distinct values of the eigenvalues and define the functions
[TABLE]
Then, the positive eigenvalues of are the positive roots of . The eigenfunction corresponding to an eigenvalue has Fourier-Laguerre expansion
[TABLE]
where , and .
Proof.
Since the set , for ranging over all partitions, is an orthonormal basis for , the eigenfunction corresponding to an eigenvalue can be written as
[TABLE]
We restrict ourselves temporarily to eigenfunctions for which this series is pointwise convergent. Substituting this series into the equation , we obtain
[TABLE]
Substituting the covariance function in the left-hand side of (3.92), writing in terms of , and assuming that we can interchange the order of integration and summation, we obtain
[TABLE]
By Theorem 3.13,
[TABLE]
On writing in terms of , the generalized Laguerre polynomial, applying (2.23) for the Laplace transform of , and making use of (3.78) and (3.79), we obtain
[TABLE]
Again writing in terms of , applying (2.2), and making use of (3.78) and (3.79), we obtain
[TABLE]
In summary, (3.3) reduces to
[TABLE]
By applying (3.94), we obtain the Fourier-Laguerre expansion of with respect to the orthonormal basis ; indeed,
[TABLE]
Similarly, by applying (3.95), we have
[TABLE]
Let
[TABLE]
and
[TABLE]
Combining (3.3)-(3.3), we find that (3.92) reduces to
[TABLE]
and by comparing the coefficients of , we obtain
[TABLE]
for all partitions . Since we have assumed that for any then we can solve the equation for to obtain
[TABLE]
Substituting (3.98) into (3.3), and applying Lemma 2.6, we get
[TABLE]
therefore,
[TABLE]
Similarly, by substituting (3.98) into (3.3) and applying Lemma 2.6, we get
[TABLE]
hence
[TABLE]
Suppose ; then it follows from (3.98) that for all partitions , which implies that , which is a contradiction since is a non-trivial eigenfunction. Hence, and cannot be both equal to 0.
Combining (3.99) and (3.100), and using the fact that and are not both 0, it is straightforward to establish that : If and , then we obtain so . If and , then we obtain and again is true. If and , then we obtain and again is true. Therefore, if is a positive eigenvalue of then it is a positive root of the function .
Conversely, suppose that is a positive root of with for any partition . Define
[TABLE]
where and are real constants that are not both equal to 0 and which satisfy (3.99) and (3.100). That such constants exist can be shown by following a case-by-case argument similar to [65, p. 48]: If , , and , then we can choose to be any non-zero number then set . If , , and , then we can choose to be any non-zero number and then set . If , , and , then we can choose to be any non-zero number and then set . Last, if , , and , then we can choose and to be any non-zero numbers.
Now define, for , the function
[TABLE]
By applying the ratio test, we obtain ; therefore .
We also verify that the series (3.102) converges pointwise. By (2.21) and (3.76),
[TABLE]
. By inequality (2.25),
[TABLE]
. Therefore,
[TABLE]
Thus, to establish the pointwise convergence of the series (3.102), we need to show that
[TABLE]
The convergence of the above series follows from the ratio test.
Next, we justify the interchange of summation and integration in our calculations. By a corollary to Theorem 16.7 in Billingsley [11, p. 224], we need to verify that
[TABLE]
First, we find a bound for . By (2.38), for . Thus, by (3.71),
[TABLE]
By the triangle inequality and by (3.106), we have
[TABLE]
Thus, to prove (3.105), we need to establish that
[TABLE]
By applying the bound (3.103), we see that it suffices to prove that
[TABLE]
and
[TABLE]
As these integrals are finite, the convergence of both series follows from (3.104).
To calculate from (3.102), we follow the same steps as before to obtain
[TABLE]
By the definition (3.101) of , and noting that
[TABLE]
we have
[TABLE]
Therefore, is an eigenvalue of with corresponding eigenfunction . ∎
Remark 3.16.
In [33], where we studied goodness-of-fit testing for the gamma distributions, we conjectured that the eigenvalues of are not eigenvalues of . However, as shown in the next subsection, this is not valid in the case of the Wishart distributions.
3.4 An interlacing property of the eigenvalues
A difficulty with the eigenvalues is that they have no closed-form expression; hence there is no simple formula for , the number of terms in the truncated series that should be used in practice to approximate the asymptotic distribution, , of the test statistic .
Since is of trace-class then, by [13, p. 237, Corollary 3.2], can be calculated by integrating the kernel or by evaluating the sum of all eigenvalues :
[TABLE]
Since also is of trace-class then
[TABLE]
All of these integrals can be evaluated using (2.9) and (2.10), and the resulting sum can be simplified using Lemma 2.6. Consequently, we obtain
[TABLE]
To determine the number of terms in the truncated series that should be used in practice to approximate the asymptotic distribution of , we derive bounds for the eigenvalues in terms of the and then obtain a general formula for as a function of . We refer to the ratio as the th scree ratio for .
Since the operator is compact and positive then the set of all its eigenvalues is countable and contains only nonnegative values [72, Theorem 8.12, p. 98]. The next result shows that the eigenvalues indeed are positive.
Proposition 3.17.
The operators and are injective; that is, if and only if , and the same holds for . In particular, 0 is not an eigenvalue of or .
Proof. By linearity, it suffices to assume that . So, suppose that , that is,
[TABLE]
for all . Then for , by Fubini’s theorem,
[TABLE]
By the definition of the covariance function in (3.9),
[TABLE]
By (2.37), (2.14), and Fubini’s theorem, we have
[TABLE]
Also, by (2.4) and (2.9), we have
[TABLE]
and, by (2.10),
[TABLE]
Substituting these results into (3.4) and discarding extraneous factors, we obtain
[TABLE]
Replacing by , we find that (3.111) is equivalent to
[TABLE]
Differentiating both sides of (3.112) with respect to , we obtain
[TABLE]
Since for all , and , then
[TABLE]
Therefore,
[TABLE]
Differentiating both sides of (3.113) with respect to , we find that
[TABLE]
As this latter integral is a Laplace transform, we obtain , -almost everywhere. Also, the same argument may be used in the case of .
Consequently, 0 is not an eigenvalue of . ∎
We now derive an interlacing property of the eigenvalues and . To state this property, denote by , the partitions of all nonnegative integers, listed in increasing lexicographic order, e.g., , , , , , ,
Proposition 3.18.
For all , . Further, for , every eigenvalue of is an eigenvalue of with multiplicity , , or .
Proof. Define the kernels and
[TABLE]
where . Also, define on the corresponding integral operators,
[TABLE]
, . Then it follows from (3.9) that .
It is clear that each is self-adjoint and of rank one, i.e., the range of is a one-dimensional subspace of . Also, is self-adjoint, and by following the same steps as in Theorem 3.14, we see that it is positive and compact.
By the same argument as in the proof of Proposition 3.17, we find that the operator is injective; hence, the eigenvalues of are positive.
Denote by , , the eigenvalues of , where , repeated according to their multiplicities. Since is compact, self-adjoint, and injective, and since is self-adjoint and of rank one, it follows from Hochstadt [36] or Dancis and Davis [20] that the eigenvalues of interlace the eigenvalues of , i.e., . Further, by Hochstadt [36], every eigenvalue of multiplicity , , of , where denotes the number of partitions of into at most parts, is an eigenvalue of with multiplicity or .
Since is self-adjoint and of rank one then, by applying again Hochstadt’s, or Dancis and Davis’, theorem, we find that the eigenvalues of interlace the eigenvalues of , i.e., for all .
Combining the conclusions of the preceding paragraphs, we deduce that , . Further, by Hochstadt [36], we have for , every eigenvalue of is an eigenvalue of with multiplicity , , or . ∎
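The interlacing results cited in the proof above concern compact operators, but their finite-dimensional analogue is easy to check numerically. The following Python sketch is only an illustration for symmetric matrices under stated assumptions: it subtracts a rank-one positive semidefinite term from a symmetric matrix and verifies that the ordered eigenvalues interlace. The helper name `rank_one_interlacing_holds` and the random test matrices are hypothetical and are not derived from the covariance operator.

```python
import numpy as np

def rank_one_interlacing_holds(A, v, tol=1e-10):
    """Check lambda_k(A) >= lambda_k(A - v v^T) >= lambda_{k+1}(A)."""
    # eigvalsh returns ascending eigenvalues; reverse to non-increasing order.
    lam_A = np.linalg.eigvalsh(A)[::-1]
    lam_B = np.linalg.eigvalsh(A - np.outer(v, v))[::-1]
    upper = np.all(lam_A >= lam_B - tol)
    lower = np.all(lam_B[:-1] >= lam_A[1:] - tol)
    return bool(upper and lower)

rng = np.random.default_rng(1)
G = rng.standard_normal((6, 6))
A = G @ G.T                      # symmetric positive semidefinite test matrix
v = rng.standard_normal(6)
holds = rank_one_interlacing_holds(A, v)
```

Applying the same check twice, once for each rank-one term, mirrors the two-step argument used in the proof, where each perturbation shifts the index of the bounding eigenvalue by at most one.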
For , we can now determine a value for such that the th scree ratio of exceeds . Applying the interlacing inequalities for , we obtain , where . Since , we advise that be chosen so that
[TABLE]
This criterion leads to a value for that is readily applicable in the analysis of data. Substituting and the value of from (3.4), we obtain
[TABLE]
For and , which represents accuracy to ten decimal places, we present in Tables 1 and 2 the values of the lower bounds on and for various values of .
As indicated by Tables 1 and 2, fewer eigenvalues appear to be needed to approximate the distribution of as increases. As we show in the following result, which is partly a consequence of the interlacing property of the eigenvalues, all but one of the and converge to 0 as , a result that is consistent with the decreasing values of and in the tables.
Corollary 3.19.
As , for all , for all , and .
Proof. By (3.74), . Expanding this expression as a power series in , we obtain
[TABLE]
Therefore, and as . By (3.75), , so it follows that if then .
By Proposition 3.18, , so it follows that as . Since the are nonnegative and listed in non-increasing order then it follows that, as , for all .
Finally, the limiting value of is obtained by taking limits in (3.109). ∎
3.5 An application to financial data
In applying our test to a financial data set, we follow in part an example given by Haff, et al. [34, Example 5.3]. Let us denote by , for the daily closing stock prices of Johnson & Johnson (JNJ), Berkshire Hathaway Inc., Class B (BRK-B), and JPMorgan Chase & Co. (JPM) respectively, from November 26, 2017 to November 23, 2018. If a day was a trading holiday, we repeated the observation of the previous day; thus we had 260 observations in total. Then, we computed the daily logarithmic returns , for and ; graphs of these logarithmic returns are given in Figure 1. Finally, we partitioned the daily logarithmic returns into biweekly periods and calculated the covariance matrix for each biweekly period, resulting in the matrices .
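The preprocessing described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: the block length of 10 daily returns per biweekly period, the use of the mean-centered sample covariance with divisor one less than the block length (the default of `np.cov`), the dropping of an incomplete final block, and the function name `biweekly_covariances` are all assumptions. The prices below are synthetic.

```python
import numpy as np

def biweekly_covariances(prices, block=10):
    """Covariance matrices of log returns over consecutive blocks of days.

    prices : array of shape (n_days, n_assets) with positive closing prices.
    block  : number of daily returns per period (10 is an assumed value).
    """
    prices = np.asarray(prices, dtype=float)
    log_returns = np.diff(np.log(prices), axis=0)
    n_blocks = log_returns.shape[0] // block   # incomplete final block is dropped
    return [np.cov(log_returns[i * block:(i + 1) * block], rowvar=False)
            for i in range(n_blocks)]

# Synthetic prices for three hypothetical assets (not the data used in the paper).
rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(0.01 * rng.standard_normal((260, 3)), axis=0))
mats = biweekly_covariances(prices)
```

Each element of `mats` is a symmetric 3 x 3 matrix, and with more returns per block than assets it is positive definite with probability one for continuous data, so the list can be passed to any procedure that expects positive definite matrices.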
A common assumption in the literature on stochastic volatility models is that the three-dimensional vectors of daily logarithmic returns,
[TABLE]
, are mutually independent and identically distributed from a trivariate normal distribution. If this assumption were valid then the corresponding biweekly covariance matrices would be independent and identically distributed with Wishart distributions. Thus, we will test the hypothesis that the biweekly covariance matrices are Wishart-distributed with degrees-of-freedom, i.e., .
To apply the test statistic to test the hypothesis that the data are drawn from a Wishart distribution with degrees of freedom and unspecified scale matrix , we use an algorithm developed by Koev and Edelman [45] in Matlab [66] to evaluate the Bessel functions of two matrix arguments. Applying that algorithm to the data on the stock prices, we find that the observed value of the test statistic is .
We conducted a simulation study to approximate , the 95th-percentile of the null distribution of . We generated random samples of size from the Wishart distribution with and scale matrix , calculated the value of for each sample, and recorded the 95th-percentile of all 10,000 simulated values of . We repeated this process a total of ten times, finally approximating as the mean of all 10 simulated 95th-percentiles, viz., . Since the observed value of exceeds the critical value then we reject the null hypothesis that the random matrices are Wishart-distributed at the 5% level of significance. Moreover, we derived from our simulation study an approximate P-value of for the test. Therefore, we have strong evidence that the three-dimensional vectors of logarithmic returns, , , do not have a trivariate normal distribution or are not mutually independent.
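The simulation scheme just described (repeated batches of simulated statistics, with the batch percentiles averaged) can be sketched in Python as follows. This is a hedged outline only: the actual test statistic requires Bessel functions of two matrix arguments, which are not implemented here, so a stand-in `toy_statistic` is used. The Wishart sampler assumes an integer degrees-of-freedom parametrization, which may differ from the shape-parameter convention of the paper, and all function names are hypothetical.

```python
import numpy as np

def sample_wishart(df, scale, rng):
    """Draw one Wishart matrix as Z^T Z, rows of Z i.i.d. N(0, scale); integer df."""
    L = np.linalg.cholesky(scale)
    Z = rng.standard_normal((df, scale.shape[0])) @ L.T
    return Z.T @ Z

def simulated_critical_value(statistic, n, df, scale, n_rep=10_000, n_batches=10,
                             level=0.95, seed=0):
    """Average, over batches, of the simulated `level`-quantile of `statistic`."""
    rng = np.random.default_rng(seed)
    quantiles = []
    for _ in range(n_batches):
        values = [statistic([sample_wishart(df, scale, rng) for _ in range(n)])
                  for _ in range(n_rep)]
        quantiles.append(np.quantile(values, level))
    return float(np.mean(quantiles))

# Stand-in statistic, NOT the Hankel-transform statistic of the paper.
def toy_statistic(sample):
    return float(np.trace(np.mean(sample, axis=0)))

crit = simulated_critical_value(toy_statistic, n=5, df=4, scale=np.eye(3),
                                n_rep=200, n_batches=3)
```

Replacing `toy_statistic` with an implementation of the test statistic, and using the default 10,000 replications and ten batches, would reproduce the structure of the study described above; averaging the batch percentiles reduces the Monte Carlo variability of the estimated critical value.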
For an alternative approach to approximating , one can use the limiting null distribution of . For , from (3.4), we obtain the approximation . This approach requires that we first calculate numerically the (that are not equal to ) and their multiplicities, using the results of Theorem 3.15, and then apply the results of Kotz, et al. [46] to derive the distribution of and carry out the test. In practice, we recommend the one-term approximation [46, Eqs. (71), (79)],
[TABLE]
which leads to the explicit expression, , for an approximate critical value of .
As an alternative to calculating , we can apply the interlacing inequalities in Proposition 3.18 to obtain a stochastic upper bound, . If we carry out the test by using the upper bound, , with its exact distribution or a one-term approximation obtained from Kotz, et al. [46, loc. cit.], we will obtain a conservative test of the null hypothesis, i.e., with a level of significance at most 5%.
3.6 Consistency of the test
Before stating the theorem, we provide a lemma which will be helpful for establishing consistency of the test. The proof of the following result is similar to the proof of Lemma 3.9.
Lemma 3.20.
For , , and ,
[TABLE]
Theorem 3.21.
Let be a sequence of positive-definite, i.i.d. random matrices with mean . Assume also that the p.d.f. of is of the form:
[TABLE]
where is orthogonally invariant. Let denote the level of significance of the test and be the -quantile of the test statistic under . If are not Wishart-distributed then
[TABLE]
Proof. By the definition (3.4) of the test statistic and (3.8), we have
[TABLE]
where . By subtracting and adding the quantity
[TABLE]
inside the squared term, and then expanding the integrand, we obtain
[TABLE]
We begin by proving that the integral (3.118) converges almost surely to 0. By (3.115), there exists a constant such that
[TABLE]
since the Frobenius norm is sub-multiplicative. By the triangle inequality, we conclude that the integral (3.118) is bounded above by
[TABLE]
By the Cauchy-Schwarz inequality,
[TABLE]
Since , then , so we have
[TABLE]
Moreover, by the Strong Law of Large Numbers and the Continuous Mapping Theorem, , almost surely. Also, again by the Strong Law of Large Numbers, , almost surely. It is elementary to verify that . Since and , we have and so . Therefore, (3.118) converges to 0, almost surely.
Second, we show that (3.6) tends to 0, almost surely. By (2.38), the fact that for , and the triangle inequality, we have
[TABLE]
Further, by the triangle inequality, the absolute value of (3.6) is less than or equal to
[TABLE]
By the Cauchy-Schwarz inequality and the fact that , (3.120) is seen to be less than or equal to
[TABLE]
Following the same argument as for integral (3.118), we conclude that integral (3.6) converges to 0, almost surely.
Since we see that the integral (3.117) equals
[TABLE]
We subtract and add inside the squared term the orthogonally invariant Hankel transform of , i.e., the quantity and expand the integrand. Then we find that (3.117) equals
[TABLE]
By the Strong Law of Large Numbers in [49, p. 189, Corollary 7.10], we conclude that the term (3.121) converges to 0, almost surely.
Next, we show that (3.122) converges to 0, almost surely. By (2.38) and the fact that for , we have
[TABLE]
Therefore, the absolute value of the integral (3.122) is less than or equal to
[TABLE]
where the latter bound follows from the Cauchy-Schwarz inequality. Again, by the Strong Law of Large Numbers in [49, p. 189, Corollary 7.10], we conclude that the integral (3.122) converges to 0, almost surely.
We have now shown that
[TABLE]
Denote by the right-hand side of (3.123); then . Suppose that , then
[TABLE]
equivalently, , -almost everywhere. By continuity, we obtain for all . By the Uniqueness Theorem for orthogonally invariant Hankel transforms, it follows that has a Wishart distribution. By Muirhead [53, p. 92, Theorem 3.2.5], also has a Wishart distribution, which contradicts the assumption that does not have a Wishart distribution. Therefore, .
Under , , and therefore , i.e., for any ,
[TABLE]
Thus, for any and , there exists such that
[TABLE]
for all . Let be the -quantile of the test statistic under . Then for all since, by definition, . Therefore, for all . In summary, for any , there exists such that for all , i.e.,
[TABLE]
By (3.123) and (3.124), we have , and therefore . Thus, by Severini [61, p. 340, Corollary 11.3 (i)], we conclude that . Further,
[TABLE]
Since the distribution function of the constant positive random variable is continuous at 0, we conclude that
[TABLE]
This concludes the proof. ∎
Remark 3.22.
We show that the assumption (3.116), made in Theorem 3.21, holds for two alternative distributions.
First, the matrix -distribution [43, Section 4, part (c)] or [38, Eqs. (65), (72)]: Let be a positive-definite random matrix with p.d.f.
[TABLE]
where and . Since is orthogonally invariant then, by Schur’s Lemma, there exists a constant such that .
Last, a linear combination of two Wishart matrices: Let be a positive-definite random matrix with p.d.f.
[TABLE]
where , , and . By [30, Section 4.4], it is known that is equal in distribution to , where and are independent, and . Again, the distribution of is orthogonally invariant, therefore it satisfies (3.116).
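The role of orthogonal invariance in this remark can be illustrated numerically. The following is only a minimal sketch under stated assumptions: it does not simulate either of the two alternatives above, but uses a Wishart matrix with identity scale (built as Z Z' from Gaussian entries) as a stand-in for an orthogonally invariant distribution, and checks that its sample mean is close to a scalar multiple of the identity, as Schur's Lemma predicts. The names `sample_wishart_identity` and `sample_mean`, and the values of `m`, `dof`, and `n`, are hypothetical choices made for illustration.

```python
import random


def sample_wishart_identity(m, dof, rng):
    """Return Z Z^T, where Z is an m x dof matrix of i.i.d. N(0, 1) entries.

    The law of Z Z^T is orthogonally invariant: H Z has the same law as Z
    for every orthogonal matrix H.
    """
    z = [[rng.gauss(0.0, 1.0) for _ in range(dof)] for _ in range(m)]
    return [
        [sum(z[i][k] * z[j][k] for k in range(dof)) for j in range(m)]
        for i in range(m)
    ]


def sample_mean(m, dof, n, seed=0):
    """Entrywise average of n independent draws of Z Z^T."""
    rng = random.Random(seed)
    total = [[0.0] * m for _ in range(m)]
    for _ in range(n):
        x = sample_wishart_identity(m, dof, rng)
        for i in range(m):
            for j in range(m):
                total[i][j] += x[i][j]
    return [[t / n for t in row] for row in total]


# Hypothetical illustration values (not taken from the paper).
mean = sample_mean(m=3, dof=5, n=4000)
for row in mean:
    print(["%.3f" % entry for entry in row])
```

In this sketch the diagonal entries of the printed matrix should be near `dof` and the off-diagonal entries near zero, so the mean has the form of a constant times the identity matrix.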
4 Contiguous Alternatives to the Null Hypothesis
In this section, we derive the limiting distribution of the test statistic under a sequence of contiguous alternatives.
4.1 Assumptions
For and , let be a triangular array of row-wise independent random matrices. As usual, let , , and let be a probability measure dominated by .
We wish to test the hypothesis
[TABLE]
against the alternative
[TABLE]
We write the Radon-Nikodym derivative of with respect to in the form
[TABLE]
We will need two assumptions in the sequel.
Assumptions 4.1.
We assume that:
- (A1) The functions form a sequence of -integrable functions converging pointwise, -almost everywhere, to a function , and
- (A2) .
Note that since then we also have , for all . Denote the indicator function of an event by . By applying (A2), we deduce the uniform integrability of :
[TABLE]
By Bauer [9, p. 95, Theorem 2.11.4], the -almost everywhere convergence of to implies the -stochastic convergence of to . Again by Bauer [9, p. 104, Theorem 2.12.4], the uniform integrability of along with the -stochastic convergence of to imply the convergence of in mean square, i.e.,
[TABLE]
and therefore
[TABLE]
Since convergence in mean square implies convergence in mean, we have
[TABLE]
and thus,
[TABLE]
Now, due to the fact that for all , we obtain
[TABLE]
4.2 Examples
In this subsection, we verify that Assumptions 4.1 are valid for a broad collection of sequences of contiguous alternatives.
4.2.1 Wishart alternatives with contiguous scale matrices
Let with and . Then,
[TABLE]
. We equate the Radon-Nikodym derivative to , obtaining
[TABLE]
for . By applying L’Hospital’s rule, we obtain
[TABLE]
for . Next, we find . Define
[TABLE]
the remainder term of the Taylor series expansion of , . Then, by elementary algebraic manipulations, we obtain
[TABLE]
By (4.2), the triangle inequality, and the Lipschitz continuity of the exponential function, we have
[TABLE]
. Therefore,
[TABLE]
It is elementary that and $n^{1/2}\big((1+n^{-1/2})^{m\alpha-1}-1\big)\to m\alpha-1$ as ; therefore, there exists a positive constant $M$ such that and $\big|n^{1/2}\big((1+n^{-1/2})^{m\alpha-1}-1\big)\big|\leq M$ for all . Therefore, , , so we obtain
[TABLE]
and this bound does not depend on . By (2.4) and (2.9), the above integral is finite; thus, .
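The elementary limit used in this subsection, namely that n^{1/2}((1 + n^{-1/2})^{mα-1} - 1) tends to mα - 1 and is therefore bounded in n, can be checked numerically. The sketch below is illustrative only; the function name `scaled_increment` and the values of `m` and `alpha` are hypothetical.

```python
def scaled_increment(n, exponent):
    """Evaluate n^{1/2} * ((1 + n^{-1/2})^exponent - 1)."""
    root_n = n ** 0.5
    return root_n * ((1.0 + 1.0 / root_n) ** exponent - 1.0)


# Hypothetical dimension and shape parameter, chosen only for illustration.
m, alpha = 3, 2.5
exponent = m * alpha - 1  # the claimed limit, m*alpha - 1

values = [scaled_increment(10 ** k, exponent) for k in (2, 4, 6, 8)]
for k, value in zip((2, 4, 6, 8), values):
    print("n = 10^%d: %.6f (limit %.1f)" % (k, value, exponent))
```

For these inputs the sequence decreases towards the limit, so its first term serves as a bound of the kind denoted by M in the text.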
4.2.2 Wishart alternatives with contiguous shape parameters
Let with , . We have
[TABLE]
. Following (4.1), we equate this Radon-Nikodym derivative to , obtaining
[TABLE]
for . Recall the multivariate digamma function
[TABLE]
. Applying L’Hospital’s rule, we obtain
[TABLE]
. To calculate , we apply the binomial expansion, obtaining
[TABLE]
thus,
[TABLE]
Next, the Taylor expansion of for sufficiently large values of is
[TABLE]
where .
After lengthy but straightforward calculations, we obtain
[TABLE]
Next, we substitute the Taylor expansion (4.4) in (4.3) and then take the limit as . By applying L’Hospital’s rule four times then, after some lengthy but straightforward calculations, we obtain
[TABLE]
Thus, is a bounded sequence, and therefore .
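For readers who wish to evaluate the multivariate digamma function numerically, the following minimal sketch may help. It assumes the standard definitions, which are not reproduced in the text above: the multivariate gamma function Γ_m(a) = π^{m(m-1)/4} ∏_{j=1}^{m} Γ(a - (j-1)/2), and the multivariate digamma function as the derivative of log Γ_m(a), i.e. the sum of ψ(a - (j-1)/2) over j = 1, ..., m. The classical digamma function is approximated here by a central difference of `math.lgamma`; all function names are hypothetical.

```python
import math


def log_multivariate_gamma(a, m):
    """log Gamma_m(a), assuming the standard product formula; needs a > (m-1)/2."""
    if a <= (m - 1) / 2:
        raise ValueError("a must exceed (m - 1) / 2")
    return (m * (m - 1) / 4) * math.log(math.pi) + sum(
        math.lgamma(a - 0.5 * (j - 1)) for j in range(1, m + 1)
    )


def digamma(x, h=1e-5):
    """Central-difference approximation to psi(x) = d/dx log Gamma(x)."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)


def multivariate_digamma(a, m):
    """Sum of psi(a - (j-1)/2) for j = 1, ..., m (assumed standard definition)."""
    return sum(digamma(a - 0.5 * (j - 1)) for j in range(1, m + 1))


# Hypothetical illustration values.
print(multivariate_digamma(3.0, 2))
```

A closed-form digamma routine from a numerical library would be more accurate; the finite difference is used here only to keep the sketch free of external dependencies.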
4.2.3 Contaminated Wishart models
Consider the contamination model,
[TABLE]
where, as usual, . We note that contaminated Wishart models appear also in the analysis of diffusion tensor images [40].
We have
[TABLE]
for . Following (4.1), we equate this Radon-Nikodym derivative to , obtaining
[TABLE]
for . Thus,
[TABLE]
. Since
[TABLE]
clearly is finite and does not depend on then .
We note also that the model (4.5) is a special case of the contamination model
[TABLE]
where is a probability measure dominated by , and . The preceding calculations can also be done for many choices of .
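The reason the resulting function does not depend on n can be seen from a short calculation. The display below is a sketch under assumptions not confirmed by the text above: it supposes that the general contamination model is the mixture of the null measure $P_0$ and a dominated measure $Q$ with weights $1-n^{-1/2}$ and $n^{-1/2}$, and that (4.1) writes the Radon-Nikodym derivative as $1+n^{-1/2}h_n$. The symbols $Q_n$, $Q$, and $q$ are notation introduced here for illustration.

```latex
% Assumed form of the contamination model (illustrative notation):
%   Q_n = (1 - n^{-1/2}) P_0 + n^{-1/2} Q,  with  Q \ll P_0,  q = dQ/dP_0.
\begin{align*}
\frac{\mathrm{d}Q_n}{\mathrm{d}P_0}
  &= (1 - n^{-1/2}) + n^{-1/2}\, q
   = 1 + n^{-1/2}\,(q - 1), \\
\text{so that}\quad h_n &= q - 1 \quad \text{for every } n .
\end{align*}
```

Under these assumptions, the pointwise convergence in (A1) is immediate, and any moment condition on $h_n$ reduces to the same condition on $q-1$ under $P_0$.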
For example, consider the case in which is the probability measure corresponding to the matrix generalized inverse Gaussian distribution [15] with density function
[TABLE]
, where is the normalizing constant, and are symmetric non-negative definite matrices, and . Then
[TABLE]
where is the normalizing constant of and . By [35, p. 506] and [15, Eq. (2)], we deduce that in the following cases:
- (i)
, ,
- (ii)
, ,
- (iii)
, ,
Therefore, we deduce that the Assumptions 4.1 also hold for broad classes of the model .
4.3 The distribution of the test statistic under contiguous alternatives
Let , ; and denote by and the -fold product probability measures of and , respectively.
Theorem 4.2.
Let and , , be a triangular array of positive-definite row-wise i.i.d. random matrices, where , . We assume that the distribution of is , for every . Further, let be a random field with
[TABLE]
. Under the Assumptions 4.1, there exists a centered Gaussian field with sample paths in and the covariance function in (3.9), and a function
[TABLE]
, such that in . Moreover, as ,
[TABLE]
We note that the proof of this theorem and the subsequent results can be obtained by following the approach in [65, pp. 79–91] and Theorem 4.3 in [33]. In order to maintain a relatively self-contained presentation, we provide some of the details here.
Before proceeding to those details, we will present some preliminary results. Consider the log-likelihood ratio,
[TABLE]
From the definition of and , we obtain
[TABLE]
Since if and only if for some , we obtain
[TABLE]
Since if and only if then
[TABLE]
Under the assumption that , we obtain
[TABLE]
as . Therefore, without loss of generality, we shall assume that and for all and (see [65, p. 140, Appendix D.2] or [71, p. 303, Example 6.118]).
The Taylor expansion of order of the function , at is
[TABLE]
with remainder term
[TABLE]
where is a measurable function. Therefore,
[TABLE]
In the following result, we use the notation .
Lemma 4.3.
As ,
- (i) in -distribution.
- (ii) in -probability.
- (iii) in -probability.
The proofs of these results are given in [65, pp. 80–83] and in [32]. Combining these three results, we conclude that under ,
[TABLE]
We introduced in Section 3.2 the random field
[TABLE]
. Also, we introduced in Theorem 3.5, the centered random field
[TABLE]
, where
[TABLE]
We proved that there exists a centered Gaussian field with sample paths in and with covariance function given in (3.9) such that, under , and in . For and , it follows from the multivariate Central Limit Theorem that $\big(\mathcal{Z}_{n,3}(T_{1}),\dotsc,\mathcal{Z}_{n,3}(T_{k})\big)^{\prime}\xrightarrow{d}\mathcal{N}_{k}(0,\Sigma)$ under , where $\Sigma=\big(K(T_{i},T_{j})\big)_{1\leq i,j\leq k}$ is the positive definite matrix with $(i,j)$th entry $K(T_{i},T_{j})$.
Let denote the standard Euclidean norm on . Then, by Lemma 4.3(iii),
[TABLE]
in -probability.
Lemma 4.4.
For , define
[TABLE]
and set $\boldsymbol{c}=\big(c(T_{1}),\dotsc,c(T_{k})\big)^{\prime}$. Then, under ,
[TABLE]
and
[TABLE]
Proof.
Substituting for in (4.8), applying Assumptions 4.1, and carrying out some straightforward calculations, we obtain
[TABLE]
for . Letting
[TABLE]
then
[TABLE]
To establish (4.9), we will apply the Cramér-Wold device. Then it suffices to establish that for every ,
[TABLE]
Now, let be i.i.d. -distributed random matrices, and define
[TABLE]
Under , has the same distribution as
[TABLE]
. Since , then
[TABLE]
Denote by the variance of . Then,
[TABLE]
By Assumptions 4.1, we obtain and ${\rm Cov}\big(h_{n}(Y_{1}),h^{2}_{n}(Y_{1})\big)<\infty$. Thus, as ,
[TABLE]
Similarly, it can be shown that, as ,
[TABLE]
-almost surely. In addition, we notice that
[TABLE]
For every ,
[TABLE]
Also, for every ,
[TABLE]
from which we conclude that as ,
[TABLE]
-almost surely. As the results (4.12)–(4.16) are the sufficient conditions in Pratt’s version of the Dominated Convergence Theorem [31, p. 221, Theorem 5.5], we conclude that as ,
[TABLE]
This result is equivalent to the Lindeberg condition, i.e., for every ,
[TABLE]
Thus, we deduce from the Lindeberg-Feller Central Limit Theorem that
[TABLE]
therefore,
[TABLE]
Note also that and that
[TABLE]
Therefore, (4.11) is proved. Finally, (4.10) follows from (4.3), (4.11), and [10, p. 25, Theorem 4.1]. ∎
Now, we proceed to the proof of Theorem 4.2.
Proof of Theorem 4.2.
By (4.6) and Le Cam’s first lemma (see [65, p. 140, Theorem D.5] or [71, p. 311, Corollary 6.124]), and are mutually contiguous. Also, by (4.10) and Le Cam’s third lemma (see [65, p. 141, Theorem D.6] or [71, p. 329, Corollary 6.139]), under ,
[TABLE]
By [65, p. 138, Theorem D.2] or [71, p. 56, Theorem 5.51], the convergence in distribution of under in implies that is tight in under . Further, since is contiguous to , by [65, p. 139, Theorem D.4] or [71, p. 295, Theorem 6.113 (a)], is tight in under .
By (4.17) and the tightness of in under , we obtain under (see [18, Theorem 2, Example 4]). Moreover, since under and is contiguous to , we have under , . Thus, by Billingsley [10, p. 25, Theorem 4.1], we obtain under .
Finally, by the Continuous Mapping Theorem [10, p. 31, Corollary 1], we have under , i.e.,
[TABLE]
under . The proof is now complete. ∎
5 The Efficiency of the Test
In this section, we investigate the approximate Bahadur slope of the test statistic under local alternatives. Further, we show the validity of a modified Wieand condition. The proof of Wieand’s condition, under which the Bahadur and Pitman efficiencies agree, remains an open problem. By applying the results of this section, we are able to calculate the approximate asymptotic relative efficiency (ARE) of the proposed test relative to potential alternative tests.
For , let be i.i.d., positive-definite random matrices with unknown distribution . We assume that is indexed by a parameter , for some . We let represent the null hypothesis and represent the alternative hypothesis. In Section 3, we showed that is scale-invariant, i.e., it does not depend on the unknown scale matrix . Thus, under the null hypothesis , we assume that are i.i.d., positive-definite -distributed random matrices and, under the local alternatives represented by , we assume that are i.i.d., positive-definite -distributed random matrices.
The Radon-Nikodym derivative of with respect to is . We assume that as , the function converges to some function in mean square, i.e.,
[TABLE]
Since , we have
[TABLE]
for . Further, we shall assume that for ,
[TABLE]
5.1 The approximate Bahadur slope of the test
For a description of the approximate Bahadur slope of a test under local alternatives and for the definition of a standard sequence, we refer to Bahadur [5, 6], Taherizadeh [65, Chapter 5] or to Section 5 in [33].
We have the following result for the test statistic .
Theorem 5.1.
The sequence of test statistics is a standard sequence. Further, , the inverse of the largest eigenvalue of the covariance operator ,
[TABLE]
and
[TABLE]
Proof. The proof of this theorem follows along the lines of the proof of Theorem 5.1 in [33]. For completeness, we provide the details here.
First, we will establish that is a standard sequence. In Section 3, we showed that the limiting null distribution of the test statistic is the same as that of , where , is an enumeration, listed in non-increasing order, of the distinct eigenvalues of with corresponding multiplicities , and are i.i.d. -distributed random variables. From the Monotone Convergence Theorem, we have
[TABLE]
which is finite since is of trace-class. Thus, is almost surely a positive random variable with continuous probability distribution function.
By Zolotarev [73],
[TABLE]
where . Therefore,
[TABLE]
which converges to as .
By assumption (5.3), for ,
[TABLE]
From the proof of Theorem 3.21, we have
[TABLE]
Since then, by (5.2),
[TABLE]
and then it follows that , the function defined in (5.4). Therefore, in -probability, so the sequence of test statistics is a standard sequence.
Finally, we find the limiting approximate Bahadur slope, as . By applying the Cauchy-Schwarz inequality, (2.38), and assumption (5.1), it is straightforward to establish that
[TABLE]
Therefore,
[TABLE]
The proof is now complete. ∎
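The limiting null distribution used in the proof above, a weighted sum of independent chi-squared variables with the eigenvalues of the covariance operator as weights, can be explored by simulation. The sketch below is only illustrative: the eigenvalues and multiplicities are hypothetical (the true ones depend on the covariance operator and are not computed here), the sum is truncated to three terms, and each eigenvalue of multiplicity k is paired with a sum of k independent squared standard normal variables. It checks that the Monte Carlo mean is close to the sum of eigenvalues counted with multiplicity, which is the finite-dimensional analogue of the trace appearing in the Monotone Convergence argument.

```python
import random


def sample_weighted_chisq(eigenvalues, multiplicities, rng):
    """Draw sum_k delta_k * (sum of n_k independent squared N(0, 1) variables)."""
    total = 0.0
    for delta, mult in zip(eigenvalues, multiplicities):
        chisq = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(mult))
        total += delta * chisq
    return total


# Hypothetical truncated spectrum, listed in non-increasing order.
eigenvalues = [0.5, 0.25, 0.125]
multiplicities = [1, 2, 1]

rng = random.Random(1)
draws = [
    sample_weighted_chisq(eigenvalues, multiplicities, rng) for _ in range(20000)
]
mc_mean = sum(draws) / len(draws)
trace = sum(d * k for d, k in zip(eigenvalues, multiplicities))
print("Monte Carlo mean: %.4f, trace: %.4f" % (mc_mean, trace))
```

Empirical quantiles of `draws` could likewise serve as rough critical values for a truncated spectrum, although that use is not part of the argument in the text.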
5.2 A modified form of Wieand’s condition
Wieand [69] showed that if two standard sequences of test statistics satisfy an additional condition, now called the Wieand condition, then the limiting approximate Bahadur efficiency is in accord with the limiting Pitman efficiency, as the level of significance decreases to 0. For a description of Pitman’s asymptotic relative efficiency, we refer to Taherizadeh [65, Chapter 5] or to Section 5 in [33]. Although the proof of Wieand’s condition remains an open problem in the matrix setting, we show that a modified form of Wieand’s condition is valid for the test statistics .
Theorem 5.2.
There exists a constant such that for any and , there exists a constant such that
[TABLE]
for any and .
Proof. For and , consider the orthogonally invariant Hankel transform,
[TABLE]
We have
[TABLE]
By adding and subtracting the term inside the squared term, and then applying Minkowski’s inequality, we obtain
[TABLE]
Now set
[TABLE]
By adding and subtracting the term
[TABLE]
inside the squared term, and then again applying Minkowski’s inequality, we get
[TABLE]
Combining (5.5) and (5.6), we conclude that
[TABLE]
Further, by subtracting and adding the term
[TABLE]
inside the squared term
[TABLE]
and then applying the Cauchy-Schwarz inequality, we obtain
[TABLE]
Next, by (3.115),
[TABLE]
Since
[TABLE]
and since the trace is invariant under cyclic permutations and the Frobenius norm is sub-multiplicative then
[TABLE]
By the Cauchy-Schwarz inequality,
[TABLE]
Since is a positive definite matrix then
[TABLE]
and therefore,
[TABLE]
Therefore,
[TABLE]
and by (5.9), we obtain
[TABLE]
By (5.7), Markov’s inequality, and Fubini’s theorem,
[TABLE]
By (5.8) and (5.10), we see that (5.11) is greater than or equal to
[TABLE]
In Theorem 3.21, we showed that . Further, by (2.38),
[TABLE]
therefore
[TABLE]
Next, we write
[TABLE]
and expand the sum. By the Cauchy-Schwarz inequality, and using the i.i.d. property of , we obtain
[TABLE]
Squaring the above sum and using the fact that are i.i.d., we obtain
[TABLE]
Since and, by (5.3), for , then and
[TABLE]
Thus, the first term in the right-hand side of (5.2) equals
[TABLE]
and the second term equals 0.
Further, by applying the Cauchy-Schwarz inequality, we also find that
[TABLE]
To show that $E_{0}\big(\operatorname{tr}\big[(X_{1}-\alpha I_{m})^{2}\big]\big)^{2}$ is finite, we write
[TABLE]
and since , for , it is sufficient to show that and . However,
[TABLE]
by (2.9). By another application of (2.9),
[TABLE]
By assumption (5.1), we conclude that there exists such that
[TABLE]
Therefore, (5.12) can be written as
[TABLE]
for all . Setting $C=\big(8\alpha^{-2}m^{5/2}\,\tilde{C}\bar{\sigma}+2\big)/\epsilon^{2}\gamma$ then, for all and ,
[TABLE]
The proof is now complete. ∎
References
- [1]
- [2] Anfinsen, S. N., and Eltoft, T. Application of the matrix-variate Mellin transform to analysis of polarimetric radar images. IEEE Transactions on Geoscience and Remote Sensing, 49, 2281–2295, 2011.
- [3] Anfinsen, S. N., Doulgeris, A. P., and Eltoft, T. Goodness-of-fit tests for multilook polarimetric radar data based on the Mellin transform. IEEE Transactions on Geoscience and Remote Sensing, 49, 2764–2781, 2011.
- [4] Asai, M., McAleer, M., and Yu, J. Multivariate stochastic volatility: a review. Econometric Reviews, 25, 145–175, 2006.
- [5] Bahadur, R. R. Stochastic comparison of tests. Annals of Mathematical Statistics, 31, 276–295, 1960.
- [6] Bahadur, R. R. Some Limit Theorems in Statistics. SIAM, Philadelphia, PA, 1971.
- [7] Baringhaus, L., and Taherizadeh, F. Empirical Hankel transforms and their applications to goodness-of-fit tests. Journal of Multivariate Analysis, 101, 1445–1467, 2010.
- [8] Baringhaus, L., and Taherizadeh, F. A K-S type test for exponentiality based on empirical Hankel transforms. Communications in Statistics - Theory and Methods, 42, 3781–3792, 2013.
