Integral Transform Methods in Goodness-of-Fit Testing, II: The Wishart Distributions

Elena Hadjicosta; Donald Richards

arXiv:1903.02653·math.ST·March 8, 2019


TL;DR

This paper develops goodness-of-fit tests for Wishart distributions using Hankel transforms of matrix arguments, with applications to various fields and analysis of test properties.

Contribution

It introduces a new Hankel transform-based method for goodness-of-fit testing of Wishart distributions with theoretical and practical insights.

Findings

01. Derived the limiting null distribution of the test statistic.

02. Proved the test's consistency against a large class of alternative distributions.

03. Applied the test to financial data.

Abstract

We initiate the study of goodness-of-fit testing when the data consist of positive definite matrices. Motivated by the recent appearance of the cone of positive definite matrices in numerous areas of applied research, including diffusion tensor imaging, models of the volatility of financial time series, wireless communication systems, and the analysis of polarimetric radar images, we apply the method of Hankel transforms of matrix argument to develop goodness-of-fit tests for Wishart distributions with given shape parameter and unknown scale matrix. We obtain the limiting null distribution of the test statistic and the corresponding covariance operator. We show that the eigenvalues of the operator satisfy an interlacing property, and we apply our test to some financial data. Moreover, we establish the consistency of the test against a large class of alternative distributions and we…

Tables

Table 1: Values of the lower bounds on $r$ and $N$ for $m=2$.

$α$	2.5	3	5	10	20	50	100
$r$	8	7	6	4	3	3	2
$N$	23	18	14	7	4	4	2

Table 2: Values of the lower bounds on $r$ and $N$ for $m=3$.

$α$	3	4	5	10	20	50	100
$r$	8	7	6	4	3	3	2
$N$	39	29	21	9	5	5	2

Equations

\Gamma_{m}(a)=\int_{R>0}(\det R)^{a-\frac{1}{2}(m+1)}\operatorname{etr}(-R)\,\mathrm{d}R,

\Gamma_{m}(a)=\pi^{m(m-1)/4}\prod_{j=1}^{m}\Gamma\big(a-\tfrac{1}{2}(j-1)\big).

f(X)=\frac{1}{\Gamma_{m}(\alpha)}(\det\Sigma)^{\alpha}(\det X)^{\alpha-\frac{1}{2}(m+1)}\operatorname{etr}(-\Sigma X),

[a]_{\kappa}=\prod_{j=1}^{m}\big(a-\tfrac{1}{2}(j-1)\big)_{k_{j}}.

C_{\kappa}(Y)=C_{\kappa}(I_{m})\,(\det Y)^{k_{m}}\int_{O(m)}\prod_{j=1}^{m-1}\big(\det\nolimits_{j}(HYH^{-1})\big)^{k_{j}-k_{j+1}}\,\mathrm{d}H,

C_{\kappa}(I_{m})=2^{2|\kappa|}\,|\kappa|!\,[m/2]_{\kappa}\,\frac{\prod_{1\leq i<j\leq\ell(\kappa)}(2k_{i}-2k_{j}-i+j)}{\prod_{i=1}^{\ell(\kappa)}(2k_{i}+\ell(\kappa)-i)!},

(\operatorname{tr}Y)^{k}=\sum_{|\kappa|=k}C_{\kappa}(Y),

\int_{O(m)}C_{\kappa}(HYH^{\prime}Z)\,\mathrm{d}H=\frac{C_{\kappa}(Y)\,C_{\kappa}(Z)}{C_{\kappa}(I_{m})}.

\sum_{|\kappa|=k}C_{\kappa}(I_{m})\,[a]_{\kappa}=(ma)_{k},

\sum_{k=0}^{\infty}\frac{t^{k}}{k!}\sum_{|\kappa|=k}C_{\kappa}(I_{m})\,[a]_{\kappa}=(\det(I_{m}-tI_{m}))^{-a},

(\det(I_{m}-tI_{m}))^{-a}\equiv(1-t)^{-ma}=\sum_{k=0}^{\infty}\frac{t^{k}}{k!}(ma)_{k},

\int_{R>0}C_{\kappa}(MR)\,(\det R)^{a-\frac{1}{2}(m+1)}\operatorname{etr}(-RZ)\,\mathrm{d}R=[a]_{\kappa}\,\Gamma_{m}(a)\,(\det Z)^{-a}\,C_{\kappa}(MZ^{-1}).

\int_{R>0}(\det R)^{a-\frac{1}{2}(m+1)}\operatorname{etr}(-RZ)\,\mathrm{d}R=\Gamma_{m}(a)\,(\det Z)^{-a},

A_{\nu}(Y)=\frac{1}{\Gamma_{m}(\nu+\frac{1}{2}(m+1))}\sum_{k=0}^{\infty}\frac{(-1)^{k}}{k!}\sum_{|\kappa|=k}\frac{1}{[\nu+\frac{1}{2}(m+1)]_{\kappa}}\,C_{\kappa}(Y).

A_{\nu}(V^{\prime}V)=\frac{1}{\pi^{m^{2}/2}\,\Gamma_{m}(\nu+\frac{1}{2})}\int_{Q^{\prime}Q<I_{m}}\operatorname{etr}(2\mathrm{i}V^{\prime}Q)\,(\det(I_{m}-Q^{\prime}Q))^{\nu-\frac{1}{2}m}\,\mathrm{d}Q,

\big|A_{\nu}(V^{\prime}V)\big|\leq\frac{1}{\Gamma_{m}(\nu+\tfrac{1}{2}(m+1))}.

\big|A_{\nu}(V^{\prime}V)\big|\leq\frac{1}{\pi^{m^{2}/2}\,\Gamma_{m}(\nu+\frac{1}{2})}\int_{Q^{\prime}Q<I_{m}}(\det(I_{m}-Q^{\prime}Q))^{\nu-\frac{1}{2}m}\,\mathrm{d}Q=A_{\nu}(0)=\frac{1}{\Gamma_{m}(\nu+\frac{1}{2}(m+1))}.

\int_{R>0}\operatorname{etr}(-RZ)\,A_{\nu}(MR)\,(\det R)^{\nu}\,\mathrm{d}R=\operatorname{etr}(-MZ^{-1})\,(\det Z)^{-\nu-\frac{1}{2}(m+1)}.

\int_{R>0}\operatorname{etr}(-RZ)\,A_{\nu}(\Lambda R)\,A_{\nu}(MR)\,(\det R)^{\nu}\,\mathrm{d}R=(\det Z)^{-\nu-\frac{1}{2}(m+1)}\operatorname{etr}(-(\Lambda+M)Z^{-1})\,A_{\nu}(-\Lambda Z^{-1}MZ^{-1}).

{}_{1}F_{1}(a;b;Y)=\sum_{k=0}^{\infty}\frac{1}{k!}\sum_{|\kappa|=k}\frac{[a]_{\kappa}}{[b]_{\kappa}}\,C_{\kappa}(Y).

{}_{1}F_{1}(a;b;Y)=\operatorname{etr}(Y)\,{}_{1}F_{1}(b-a;b;-Y).

\Gamma_{m}(\nu+\tfrac{1}{2}(m+1))\int_{R>0}A_{\nu}(MR)\,(\det R)^{a-\frac{1}{2}(m+1)}\operatorname{etr}(-RZ)\,\mathrm{d}R=\Gamma_{m}(a)\,(\det Z)^{-a}\,{}_{1}F_{1}\big(a;\nu+\tfrac{1}{2}(m+1);-MZ^{-1}\big).

L^{(\gamma)}_{\kappa}(Y)=\big[\gamma+\tfrac{1}{2}(m+1)\big]_{\kappa}\,C_{\kappa}(I_{m})\sum_{s=0}^{|\kappa|}\sum_{|\sigma|=s}\binom{\kappa}{\sigma}\frac{C_{\sigma}(-Y)}{[\gamma+\frac{1}{2}(m+1)]_{\sigma}\,C_{\sigma}(I_{m})},

L^{(\gamma)}_{\kappa}(0)=\big[\gamma+\tfrac{1}{2}(m+1)\big]_{\kappa}\,C_{\kappa}(I_{m}).

\mathcal{L}^{(\gamma)}_{\kappa}(Y):=\big(|\kappa|!\,L^{(\gamma)}_{\kappa}(0)\big)^{-1/2}L^{(\gamma)}_{\kappa}(Y),

\frac{1}{\Gamma_{m}\big(\gamma+\tfrac{1}{2}(m+1)\big)}\int_{Y>0}\mathcal{L}^{(\gamma)}_{\kappa}(Y)\,\mathcal{L}^{(\gamma)}_{\sigma}(Y)\,(\det Y)^{\gamma}\operatorname{etr}(-Y)\,\mathrm{d}Y=\begin{cases}1,&\kappa=\sigma\\ 0,&\kappa\neq\sigma\end{cases}.

\int_{Y>0}\operatorname{etr}(-YZ)\,(\det Y)^{\gamma}\,L^{(\gamma)}_{\kappa}(Y)\,\mathrm{d}Y=\big[\gamma+\tfrac{1}{2}(m+1)\big]_{\kappa}\,\Gamma_{m}(\gamma+\tfrac{1}{2}(m+1))\,(\det Z)^{-\gamma-\frac{1}{2}(m+1)}\,C_{\kappa}(I_{m}-Z^{-1}).

\operatorname{etr}(-Z)\,L^{(\gamma)}_{\kappa}(Z)=\int_{Y>0}\operatorname{etr}(-Y)\,(\det Y)^{\gamma}\,C_{\kappa}(Y)\,A_{\gamma}(ZY)\,\mathrm{d}Y.

\big|L^{(\gamma)}_{\kappa}(Z)\big|\leq\operatorname{etr}(Z)\,\big[\gamma+\tfrac{1}{2}(m+1)\big]_{\kappa}\,C_{\kappa}(I_{m}).

\int_{Y > 0}


Full text

Integral Transform Methods in Goodness-of-Fit Testing, II: The Wishart Distributions

Elena Hadjicosta and Donald Richards

Department of Statistics, Pennsylvania State University, University Park, PA 16802, U.S.A.

MSC 2010 subject classifications: Primary 33C10, 62G10; Secondary 15A52, 62G20, 62H15.

Key words and phrases: Bahadur slope; Bessel function of matrix argument; contamination model; contiguous alternative; Frobenius norm; Gaussian random field; generalized Laguerre polynomial; goodness-of-fit testing; Hankel transform of matrix argument; Hilbert-Schmidt operator; hypergeometric function of matrix argument; operator norm; Pitman efficiency; Schur’s lemma; Wishart distribution; zonal polynomials.

Abstract

We initiate the study of goodness-of-fit testing when the data consist of positive definite matrices. Motivated by the recent appearance of the cone of positive definite matrices in numerous areas of applied research, including diffusion tensor imaging, models of the volatility of financial time series, wireless communication systems, and the analysis of polarimetric radar images, we apply the method of Hankel transforms of matrix argument to develop goodness-of-fit tests for Wishart distributions with given shape parameter and unknown scale matrix. We obtain the limiting null distribution of the test statistic and the corresponding covariance operator. We show that the eigenvalues of the operator satisfy an interlacing property, and we apply our test to some financial data. Moreover, we establish the consistency of the test against a large class of alternative distributions and we derive the asymptotic distribution of the test statistic under a sequence of contiguous alternatives. We establish the Bahadur and Pitman efficiency properties of the test statistic and we show the validity of a modified Wieand condition.

1 Introduction
2 Wishart Distributions and Hankel Transforms of Matrix Argument
2.1 Preliminary results for the Wishart distributions
2.2 Bessel functions and Laguerre polynomials of matrix argument
2.3 Hankel transforms of matrix argument
2.4 Orthogonally invariant Hankel transforms of matrix argument
3 Goodness-of-Fit Tests for the Wishart Distributions
3.1 The test statistic
3.2 The limiting null distribution of the test statistic
3.2.1 Preliminary details
3.2.2 The proof of the limiting distribution
3.3 Eigenvalues and eigenfunctions of the covariance operator
3.4 An interlacing property of the eigenvalues
3.5 An application to financial data
3.6 Consistency of the test
4 Contiguous Alternatives to the Null Hypothesis
4.1 Assumptions
4.2 Examples
4.2.1 Wishart alternatives with contiguous scale matrices
4.2.2 Wishart alternatives with contiguous shape parameters
4.2.3 Contaminated Wishart models
4.3 The distribution of the test statistic under contiguous alternatives
5 The Efficiency of the Test
5.1 The approximate Bahadur slope of the test
5.2 A modified form of Wieand’s condition

1 Introduction

In this paper, we develop goodness-of-fit tests for the Wishart distributions, extending the results of [7, 65] for the exponential distributions and [32, 33] for the gamma distributions. In recent years, the cone of positive definite matrices has arisen in numerous applications, e.g., diffusion tensor imaging, financial time series, wireless communication systems, and polarimetric radar images; it is these applications that motivate our study of goodness-of-fit tests for probability distributions on the cone.

Positive definite random matrix data have appeared in medical research, specifically in diffusion tensor imaging (DTI), cf. [22, 39, 40, 44, 50, 58, 59, 60]. DTI is a magnetic resonance imaging method that has attracted much interest in the study of brain diseases. DTI is based on the observation that water molecules in vivo are always in motion; by modelling the diffusion of the water molecules at any location by a three-dimensional Brownian motion, the resulting diffusion tensor image is represented by the $3\times 3$ positive definite matrix of the local diffusion process at the given location.

Although DTI is non-invasive, it enables the study of deep brain white-matter fibers. Thus, DTI has been used to study epileptic seizures, Alzheimer’s disease, traumatic brain injuries, aging, white-matter abnormalities, developmental disorders, and psychiatric conditions [54, 57, 55, 52]. DTI has also been used to study the pathology of organ and tissue types such as the breast, cardiac, kidney, lingual, skeletal muscles, and spinal cord [19]. In numerous articles, the Wishart distribution with known degrees-of-freedom and unknown scale matrix has been used to model DTI data [22, 39, 40].

The Wishart distributions with known degrees-of-freedom also arise in stochastic volatility models [4, 27, 47]. In this area, the problem is to estimate the covariance matrix of the joint capital returns on several financial assets, with the goal of predicting future returns, devising portfolio allocations, pricing options, and estimating risk.

The complex Wishart distributions with known degrees-of-freedom arise in the spectral analysis of multivariate Gaussian time series [26], wireless communications [63, 64, 67], and the analysis of polarimetric synthetic aperture radar [2, 3]. These applications are widespread: the spectral analysis of such time series arises in signal processing, econometrics, and meteorology, and polarimetric radar has become an important remote sensing device due to its heightened ability to distinguish between distinct scattering sources. The results to follow can be extended, after making obvious necessary changes, to the complex Wishart distributions [38, p. 488] and even to the Wishart distributions on general symmetric cones [24].

The technical details required to develop goodness-of-fit tests for positive definite matrix data are extensive. Naturally, we will need mathematical analysis on the cone of positive definite matrices [51], the Bessel functions and Laguerre polynomials of matrix argument and their zonal polynomial expansions [28, 35, 38, 53], and Hankel transforms of matrix argument [35]. Further complications arising from the non-commutative nature of matrix multiplication lead us to impose on the distribution of the sample data an orthogonal invariance condition. In addition, the Frobenius, spectral, and operator norms arise in the matrix case, and numerous inequalities between them will be needed. There is also the surprising appearance of Schur’s lemma, a result well-known in linear algebra but which appears only rarely in statistical inference.

We now describe the results in the paper. Throughout, we will follow as templates the presentations in [7, 33, 65]. In Section 2 we provide some properties of the Wishart distributions, and related results for the Bessel functions, Hankel transforms, confluent hypergeometric functions, and generalized Laguerre polynomials, all of matrix argument. Further, we present uniqueness theorems for the Hankel transform of matrix argument, a Hankel inversion formula, and some limit theorems. After providing results on a generalized hypergeometric function of two matrix arguments, we define the orthogonally invariant Hankel transform and present some of its properties.

In Section 3, we propose an integral-type test statistic $\boldsymbol{T}_{n}^{2}$ for goodness-of-fit testing for the Wishart distributions. Generalizing the one-dimensional cases [7, 33, 65], the statistic $\boldsymbol{T}_{n}^{2}$ is a squared integral, (3.3), involving the empirical orthogonally invariant Hankel transform. We obtain the asymptotic distribution of $\boldsymbol{T}_{n}^{2}$ under the null hypothesis, proving that $\boldsymbol{T}_{n}^{2}$ converges in distribution to a weighted sum of independent and identically distributed random variables, each having a chi-square distribution with one degree-of-freedom. The coefficients of the weighted sum are the positive eigenvalues of the covariance operator corresponding to a certain zero-mean Gaussian random field. The determination of the multiplicity of the eigenvalues remains an open problem, but we show that the eigenvalues satisfy an interlacing property and we show the usefulness of the interlacing property in an application of the test statistic to financial data. Also, we establish the consistency of the test against a large class of alternative distributions.

In Section 4, we derive the asymptotic distribution of $\boldsymbol{T}_{n}^{2}$ under certain sequences of contiguous alternatives to the null hypothesis. Specifically, we consider Wishart alternatives with varying shape or scale parameters, and some classes of contaminated Wishart models in which the contamination distribution is a generalized inverted Gaussian.

Finally, in Section 5, we establish the Bahadur and Pitman efficiency properties of the statistic $\boldsymbol{T}_{n}^{2}$ . We investigate the approximate Bahadur slope of $\boldsymbol{T}_{n}^{2}$ under local alternatives and we show the validity of a modified Wieand condition. A complete extension of Wieand’s condition, under which the Bahadur and Pitman efficiencies coincide, remains an open problem.

2 Wishart Distributions and Hankel Transforms of Matrix Argument

2.1 Preliminary results for the Wishart distributions

Throughout the paper, all needed results on the zonal polynomials and on the special functions of matrix argument are provided by Herz [35], Muirhead [53], or Richards [56], so we will generally conform to the notation in those sources. We denote the zero matrix of any order by $0$, the order being always determined by the context; further, $I_{m}$ will denote the $m\times m$ identity matrix. We also denote by $\mathbb{R}^{m\times m}$ the space of $m\times m$ (real) matrices, by $\mathcal{S}^{m\times m}$ the space of $m\times m$ symmetric matrices, by $\mathcal{P}_{+}^{m\times m}$ the cone of $m\times m$ positive-definite matrices, and by $O(m)$ the group of $m\times m$ orthogonal matrices. To specify that $Y\in\mathcal{P}_{+}^{m\times m}$, we usually write $Y>0$; more generally, we write $Y_{1}>Y_{2}$ whenever $Y_{1}-Y_{2}>0$. Further, we denote the trace of $Y$ by $\operatorname{tr}(Y)$, the determinant of $Y$ by $\det(Y)$, and we write $\operatorname{etr}(Y)$ for $\exp(\operatorname{tr}Y)$.

The multivariate gamma function is defined by

$$\Gamma_{m}(a)=\int_{R>0}(\det R)^{a-\frac{1}{2}(m+1)}\operatorname{etr}(-R)\,\mathrm{d}R,$$

for $a\in\mathbb{C}$ , Re $(a)>\frac{1}{2}(m-1)$ ; this integral is well-known to have the explicit formula,

$$\Gamma_{m}(a)=\pi^{m(m-1)/4}\prod_{j=1}^{m}\Gamma\big(a-\tfrac{1}{2}(j-1)\big).$$
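The product formula $\Gamma_{m}(a)=\pi^{m(m-1)/4}\prod_{j=1}^{m}\Gamma(a-\tfrac{1}{2}(j-1))$ is straightforward to evaluate numerically. The following minimal Python sketch uses only the standard library; the helper name `log_multigamma` is hypothetical and chosen for illustration. It works on the log scale to avoid overflow and checks the case $m=2$, $a=3/2$, where the formula gives $\pi^{1/2}\,\Gamma(3/2)\,\Gamma(1)=\pi/2$.

```python
import math


def log_multigamma(a: float, m: int) -> float:
    """Return log Gamma_m(a) from the product formula; requires a > (m - 1)/2."""
    if a <= 0.5 * (m - 1):
        raise ValueError("need a > (m - 1)/2")
    log_pi_term = 0.25 * m * (m - 1) * math.log(math.pi)
    # j = 0, ..., m - 1 here corresponds to (j - 1)/2 for j = 1, ..., m in the text
    return log_pi_term + sum(math.lgamma(a - 0.5 * j) for j in range(m))


# For m = 2 and a = 3/2 the two printed numbers should agree (both equal pi / 2).
print(math.exp(log_multigamma(1.5, 2)), math.pi / 2)
```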

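An $m\times m$ positive-definite random matrix $X$ is said to have a Wishart distribution if its probability density function (p.d.f.) is of the form

$$f(X)=\frac{1}{\Gamma_{m}(\alpha)}(\det\Sigma)^{\alpha}(\det X)^{\alpha-\frac{1}{2}(m+1)}\operatorname{etr}(-\Sigma X),\tag{2.1}$$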

$X>0$ , where $\alpha>\frac{1}{2}(m-1)$ and $\Sigma>0$ . We write $X\sim W_{m}(\alpha,\Sigma)$ whenever (2.1) holds. The parameter $\alpha$ is called the shape parameter and $\Sigma$ is called the scale matrix of $X$ . If $\alpha$ is a half-integer then $2\alpha$ is called the degrees-of-freedom of $X$ . In general, $E(X)=\alpha\Sigma^{-1}$ ; also, if $M$ is a $q\times m$ matrix of rank $q$ , where $q\leq m$ , then $MXM^{\prime}\sim W_{q}(\alpha,(M\Sigma^{-1}M^{\prime})^{-1})$ [53, p. 92, Theorem 3.2.5].
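As a quick check on this parametrization (an added illustration, not part of the original text), setting $m=1$ with scalar scale $\Sigma=\sigma>0$ turns the density (2.1) into the gamma density with shape $\alpha$ and rate $\sigma$, whose mean $\alpha/\sigma$ agrees with $E(X)=\alpha\Sigma^{-1}$:

```latex
f(x) = \frac{\sigma^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\sigma x}, \quad x>0,
\qquad
E(X) = \int_{0}^{\infty} x\, f(x)\,\mathrm{d}x
     = \frac{\sigma^{\alpha}}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+1)}{\sigma^{\alpha+1}}
     = \frac{\alpha}{\sigma}.
```

Under this convention, the more common parametrization with $n$ degrees of freedom and covariance-type scale matrix $V$ corresponds to $\alpha=n/2$ and $\Sigma=\tfrac{1}{2}V^{-1}$.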

A partition $\kappa=(k_{1},\dotsc,k_{m})$ is a vector of nonnegative integers, listed in non-increasing order. The weight of $\kappa$ is $|\kappa|=k_{1}+\dotsc+k_{m}$ , and the length, $\ell(\kappa)$ , of $\kappa$ is the number of non-zero $k_{j}$ , $j=1,\dotsc,m$ .

For $a\in\mathbb{C}$ and $k=0,1,2,\ldots$ , the shifted factorial is defined as $(a)_{k}=a(a+1)(a+2)\cdots(a+k-1)$ . For any partition $\kappa=(k_{1},\dotsc,k_{m})$ , the partitional shifted factorial is defined as

$$[a]_{\kappa}=\prod_{j=1}^{m}\big(a-\tfrac{1}{2}(j-1)\big)_{k_{j}}.$$
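The partitional shifted factorial $[a]_{\kappa}=\prod_{j=1}^{m}(a-\tfrac{1}{2}(j-1))_{k_{j}}$ can be computed exactly with rational arithmetic. The sketch below is illustrative only; the function names are hypothetical, and partitions are represented as tuples of integers in non-increasing order.

```python
from fractions import Fraction


def pochhammer(a, k):
    """Shifted factorial (a)_k = a (a + 1) ... (a + k - 1), with (a)_0 = 1."""
    result = Fraction(1)
    for i in range(k):
        result *= a + i
    return result


def partitional_pochhammer(a, kappa):
    """[a]_kappa = prod_j (a - (j - 1)/2)_{k_j} for a partition kappa."""
    result = Fraction(1)
    for j, k_j in enumerate(kappa):  # j = 0 corresponds to j = 1 in the text
        result *= pochhammer(Fraction(a) - Fraction(j, 2), k_j)
    return result


# [2]_(2,1) = (2)_2 * (3/2)_1 = 6 * 3/2
print(partitional_pochhammer(2, (2, 1)))
```

For a partition with a single part, the function reduces to the ordinary shifted factorial.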

For $Y\in\mathcal{S}^{m\times m}$ , we denote by $\det_{j}(Y)$ the $j$ th principal minor of $Y$ , $j=1,\ldots,m$ . For any partition $\kappa$ , the zonal polynomial $C_{\kappa}(Y)$ is defined as

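$$C_{\kappa}(Y)=C_{\kappa}(I_{m})\,(\det Y)^{k_{m}}\int_{O(m)}\prod_{j=1}^{m-1}\big(\det\nolimits_{j}(HYH^{-1})\big)^{k_{j}-k_{j+1}}\,\mathrm{d}H,\tag{2.2}$$

where $\mathrm{d}H$ is the normalized Haar measure on $O(m)$ [56, (35.4.2)]. By (2.2), $C_{\kappa}(Y)$ is homogeneous of degree $|\kappa|$.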

It also follows from the invariance of the Haar measure that $C_{\kappa}(HYH^{\prime})=C_{\kappa}(Y)$ for all $H\in O(m)$ and $Y\in\mathcal{S}^{m\times m}$ ; hence, $C_{\kappa}(Y)$ depends only on the eigenvalues of $Y$ and it is a symmetric function of the eigenvalues. Suppose that $Z\in\mathcal{S}^{m\times m}$ and that $Y^{1/2}$ denotes the unique positive definite square root of $Y\in\mathcal{P}_{+}^{m\times m}$ . Since the matrices $Y^{1/2}ZY^{1/2}$ , $YZ$ , and $ZY$ all have the same eigenvalues we will follow a widely-adopted convention, writing $C_{\kappa}(YZ)$ or $C_{\kappa}(ZY)$ for $C_{\kappa}(Y^{1/2}ZY^{1/2})$ ; throughout the paper, we retain this convention for all orthogonally invariant functions of matrix argument.

With the normalization

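$$C_{\kappa}(I_{m})=2^{2|\kappa|}\,|\kappa|!\,[m/2]_{\kappa}\,\frac{\prod_{1\leq i<j\leq\ell(\kappa)}(2k_{i}-2k_{j}-i+j)}{\prod_{i=1}^{\ell(\kappa)}(2k_{i}+\ell(\kappa)-i)!},\tag{2.3}$$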

the zonal polynomials satisfy the identity,

$$(\operatorname{tr}Y)^{k}=\sum_{|\kappa|=k}C_{\kappa}(Y),\tag{2.4}$$

$k=0,1,2,\dotsc$ (see [53, Eq. (iii), p. 228] or [56, Eq. (35.4.6)]). Further, for $Y,Z\in\mathcal{S}^{m\times m}$ , the zonal polynomials satisfy the mean-value property [53, p. 243],

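$$\int_{O(m)}C_{\kappa}(HYH^{\prime}Z)\,\mathrm{d}H=\frac{C_{\kappa}(Y)\,C_{\kappa}(Z)}{C_{\kappa}(I_{m})}.\tag{2.5}$$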

We will also need in the sequel the identity,

$$\sum_{|\kappa|=k}C_{\kappa}(I_{m})\,[a]_{\kappa}=(ma)_{k},\tag{2.6}$$

$a\in\mathbb{C}$ , $k=0,1,2,\dotsc$ . This result is established by applying a power series identity,

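$$\sum_{k=0}^{\infty}\frac{t^{k}}{k!}\sum_{|\kappa|=k}C_{\kappa}(I_{m})\,[a]_{\kappa}=(\det(I_{m}-tI_{m}))^{-a},\tag{2.7}$$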

$|t|<1$ ; see [38, p. 495, Eq. (143)], [53, p. 259, Eq. (4)]. Writing

$$(\det(I_{m}-tI_{m}))^{-a}\equiv(1-t)^{-ma}=\sum_{k=0}^{\infty}\frac{t^{k}}{k!}(ma)_{k},\tag{2.8}$$

then (2.6) is obtained by comparing the coefficients of $t^{k}$ in (2.7) and (2.8).
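The identity (2.6), $\sum_{|\kappa|=k}C_{\kappa}(I_{m})[a]_{\kappa}=(ma)_{k}$, can be checked exactly for small $m$ and $k$. The sketch below evaluates $C_{\kappa}(I_{m})$ from the normalization (2.3), namely $2^{2|\kappa|}|\kappa|!\,[m/2]_{\kappa}\prod_{i<j}(2k_{i}-2k_{j}-i+j)/\prod_{i}(2k_{i}+\ell(\kappa)-i)!$, and sums over partitions with at most $m$ parts. All function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction
from math import factorial


def partitions(k, max_parts, largest=None):
    """Yield partitions of k into at most max_parts positive parts (non-increasing)."""
    if largest is None:
        largest = k
    if k == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, max_parts - 1, first):
            yield (first,) + rest


def pochhammer(a, k):
    """Shifted factorial (a)_k."""
    out = Fraction(1)
    for i in range(k):
        out *= a + i
    return out


def partitional_pochhammer(a, kappa):
    """Partitional shifted factorial [a]_kappa."""
    out = Fraction(1)
    for j, k_j in enumerate(kappa):
        out *= pochhammer(Fraction(a) - Fraction(j, 2), k_j)
    return out


def zonal_at_identity(kappa, m):
    """C_kappa(I_m) from the normalization (2.3); kappa lists only nonzero parts."""
    k, ell = sum(kappa), len(kappa)
    numerator = Fraction(1)
    for i in range(ell):
        for j in range(i + 1, ell):
            # the difference j - i is the same for 0-based and 1-based indices
            numerator *= 2 * kappa[i] - 2 * kappa[j] - i + j
    denominator = 1
    for i in range(ell):
        denominator *= factorial(2 * kappa[i] + ell - (i + 1))  # 1-based i in (2.3)
    return (Fraction(4) ** k * factorial(k)
            * partitional_pochhammer(Fraction(m, 2), kappa)
            * numerator / denominator)


def check_identity(a, m, k):
    """Return both sides of (2.6) as exact fractions."""
    lhs = sum(zonal_at_identity(kap, m) * partitional_pochhammer(a, kap)
              for kap in partitions(k, m))
    return lhs, pochhammer(Fraction(a) * m, k)


lhs, rhs = check_identity(Fraction(7, 3), m=3, k=4)
print(lhs == rhs)
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point tolerance check.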

The zonal polynomials also satisfy a Laplace transform identity [53, p. 248]: For Re $(a)>\tfrac{1}{2}(m-1)$ , $Z>0$ , and $M\in\mathcal{S}^{m\times m}$ ,

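$$\int_{R>0}C_{\kappa}(MR)\,(\det R)^{a-\frac{1}{2}(m+1)}\operatorname{etr}(-RZ)\,\mathrm{d}R=[a]_{\kappa}\,\Gamma_{m}(a)\,(\det Z)^{-a}\,C_{\kappa}(MZ^{-1}).\tag{2.9}$$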

For $\kappa=0$ , this result reduces to

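$$\int_{R>0}(\det R)^{a-\frac{1}{2}(m+1)}\operatorname{etr}(-RZ)\,\mathrm{d}R=\Gamma_{m}(a)\,(\det Z)^{-a},\tag{2.10}$$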

from which we confirm that (2.1) is a probability density function [53, p. 61].
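The normalization claim can be spelled out in one line. Taking $a=\alpha$ and $Z=\Sigma$ in (2.10), a step that the text leaves implicit, gives:

```latex
\int_{X>0} f(X)\,\mathrm{d}X
 = \frac{(\det\Sigma)^{\alpha}}{\Gamma_{m}(\alpha)}
   \int_{X>0} (\det X)^{\alpha-\frac{1}{2}(m+1)} \operatorname{etr}(-\Sigma X)\,\mathrm{d}X
 = \frac{(\det\Sigma)^{\alpha}}{\Gamma_{m}(\alpha)}\,\Gamma_{m}(\alpha)\,(\det\Sigma)^{-\alpha}
 = 1 .
```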

2.2 Bessel functions and Laguerre polynomials of matrix argument

The Bessel function of matrix argument, first treated in detail by Herz [35], can be defined in several ways. Let $\nu\in\mathbb{C}$ be such that $-\nu+\tfrac{1}{2}(j-m)\notin\mathbb{N}$ for all $j=1,\ldots,m$ ; these restrictions ensure that $[\nu+\tfrac{1}{2}(m+1)]_{\kappa}\neq 0$ for all partitions $\kappa$ . Following Muirhead [53, Chapter 7], the Bessel function (of the first kind) of order $\nu$ is defined for $Y\in\mathcal{S}^{m\times m}$ as

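$$A_{\nu}(Y)=\frac{1}{\Gamma_{m}(\nu+\frac{1}{2}(m+1))}\sum_{k=0}^{\infty}\frac{(-1)^{k}}{k!}\sum_{|\kappa|=k}\frac{1}{[\nu+\frac{1}{2}(m+1)]_{\kappa}}\,C_{\kappa}(Y).\tag{2.11}$$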

We also refer to [24, 28, 38, 56] for further details of these Bessel functions. In particular, the series (2.11) converges absolutely for all $Y\in\mathcal{S}^{m\times m}$ [28, Theorem 6.3].

For Re $(\nu)>\tfrac{1}{2}(m-2)$ , the Bessel function $A_{\nu}$ is also given by Herz’s generalization of the classical Poisson integral [35, Eq. (3.6´)]: For any $m\times m$ matrix $V$ ,

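$$A_{\nu}(V^{\prime}V)=\frac{1}{\pi^{m^{2}/2}\,\Gamma_{m}(\nu+\frac{1}{2})}\int_{Q^{\prime}Q<I_{m}}\operatorname{etr}(2\mathrm{i}V^{\prime}Q)\,(\det(I_{m}-Q^{\prime}Q))^{\nu-\frac{1}{2}m}\,\mathrm{d}Q,\tag{2.12}$$

where $\mathrm{i}=\sqrt{-1}$ and the integral is with respect to Lebesgue measure on the set $\{Q\in\mathbb{R}^{m\times m}:Q^{\prime}Q<I_{m}\}$. This result leads to an inequality that will arise repeatedly in the sequel.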

Lemma 2.1.

For Re $(\nu)>\tfrac{1}{2}(m-2)$ and $V\in\mathbb{R}^{m\times m}$ ,

$$\big|A_{\nu}(V^{\prime}V)\big|\leq\frac{1}{\Gamma_{m}(\nu+\tfrac{1}{2}(m+1))}.\tag{2.13}$$

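Proof. Since $|\operatorname{etr}(2\mathrm{i}V^{\prime}Q)|\leq 1$, it follows from (2.11) and (2.12) that

$$\big|A_{\nu}(V^{\prime}V)\big|\leq\frac{1}{\pi^{m^{2}/2}\,\Gamma_{m}(\nu+\frac{1}{2})}\int_{Q^{\prime}Q<I_{m}}(\det(I_{m}-Q^{\prime}Q))^{\nu-\frac{1}{2}m}\,\mathrm{d}Q=A_{\nu}(0)=\frac{1}{\Gamma_{m}(\nu+\frac{1}{2}(m+1))}.\qquad\blacksquare$$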
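Lemma 2.1 can be illustrated numerically in the scalar case $m=1$, where the series (2.11) reduces to $A_{\nu}(y)=\sum_{k}(-1)^{k}y^{k}/(k!\,\Gamma(\nu+1+k))$ and the bound reads $|A_{\nu}(y)|\leq 1/\Gamma(\nu+1)$ for $y\geq 0$. The sketch below is an illustration under that reduction; `bessel_a` is a hypothetical helper name and the truncation length is an arbitrary choice. It also compares against the closed form $A_{1/2}(y)=\sin(2\sqrt{y})/(\sqrt{\pi}\sqrt{y})$, which follows from the classical relation $A_{\nu}(y)=y^{-\nu/2}J_{\nu}(2\sqrt{y})$.

```python
import math


def bessel_a(nu: float, y: float, terms: int = 200) -> float:
    """Truncated scalar series for A_nu(y), valid for nu > -1."""
    term = 1.0 / math.gamma(nu + 1.0)  # k = 0 term
    total = term
    for k in range(terms - 1):
        # ratio of consecutive terms: -y / ((k + 1) (nu + 1 + k))
        term *= -y / ((k + 1) * (nu + 1.0 + k))
        total += term
    return total


y = 2.0
closed_form = math.sin(2.0 * math.sqrt(y)) / (math.sqrt(math.pi) * math.sqrt(y))
print(bessel_a(0.5, y), closed_form)

# The largest absolute value on a grid should not exceed 1 / Gamma(nu + 1).
nu = 1.0
print(max(abs(bessel_a(nu, 0.5 * i)) for i in range(81)), 1.0 / math.gamma(nu + 1.0))
```

The alternating series loses accuracy for very large arguments because of cancellation, so the grid is kept to moderate values of $y$.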

For Re $(\nu)>-1$ , $M$ symmetric, and $Z>0$ , the Bessel function of matrix argument satisfies the Laplace transform identity,

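$$\int_{R>0}\operatorname{etr}(-RZ)\,A_{\nu}(MR)\,(\det R)^{\nu}\,\mathrm{d}R=\operatorname{etr}(-MZ^{-1})\,(\det Z)^{-\nu-\frac{1}{2}(m+1)}.\tag{2.14}$$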

Indeed, this identity is Herz’s original definition of $A_{\nu}(R)$ [35, Eq. (2.5)].

Herz [35, Eq. (5.8)] also obtained a fundamental generalization of a classical formula known as Weber’s second exponential integral: For Re $(\nu)>-1$ , $m\times m$ symmetric matrices $\Lambda$ and $M$ , and $Z>0$ ,

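$$\int_{R>0}\operatorname{etr}(-RZ)\,A_{\nu}(\Lambda R)\,A_{\nu}(MR)\,(\det R)^{\nu}\,\mathrm{d}R=(\det Z)^{-\nu-\frac{1}{2}(m+1)}\operatorname{etr}(-(\Lambda+M)Z^{-1})\,A_{\nu}(-\Lambda Z^{-1}MZ^{-1}).\tag{2.15}$$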

Let $a,b\in\mathbb{C}$ where $-b+\tfrac{1}{2}(j+1)\notin\mathbb{N}$ , $j=1,\ldots,m$ . The confluent hypergeometric function of matrix argument is defined, for $Y\in\mathcal{S}^{m\times m}$ , as

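$${}_{1}F_{1}(a;b;Y)=\sum_{k=0}^{\infty}\frac{1}{k!}\sum_{|\kappa|=k}\frac{[a]_{\kappa}}{[b]_{\kappa}}\,C_{\kappa}(Y).\tag{2.16}$$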

We will make repeated use of Kummer’s formula [35, Eq. (2.8)], [53, p. 265], [56, §35.8]:

$${}_{1}F_{1}(a;b;Y)=\operatorname{etr}(Y)\,{}_{1}F_{1}(b-a;b;-Y).\tag{2.17}$$
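Kummer's formula, ${}_{1}F_{1}(a;b;Y)=\operatorname{etr}(Y)\,{}_{1}F_{1}(b-a;b;-Y)$, is easy to check numerically when $m=1$, where the zonal polynomial series collapses to the classical series $\sum_{k}(a)_{k}y^{k}/((b)_{k}\,k!)$. The sketch below is only a scalar illustration; `hyp1f1` is a hypothetical helper name and the parameter values are arbitrary.

```python
import math


def hyp1f1(a: float, b: float, y: float, terms: int = 200) -> float:
    """Truncated scalar series sum_k (a)_k / (b)_k * y^k / k!."""
    term, total = 1.0, 1.0
    for k in range(terms - 1):
        # ratio of consecutive terms: (a + k) / (b + k) * y / (k + 1)
        term *= (a + k) / (b + k) * y / (k + 1)
        total += term
    return total


a, b, y = 0.75, 2.0, 1.3
lhs = hyp1f1(a, b, y)
rhs = math.exp(y) * hyp1f1(b - a, b, -y)
print(lhs, rhs)  # the two sides of Kummer's formula for m = 1
```

In particular, taking $a=b$ on the left gives ${}_{1}F_{1}(b;b;y)=e^{y}$, since the right-hand side then involves ${}_{1}F_{1}(0;b;-y)=1$.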

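There is a Laplace transform relationship between the Bessel function $A_{\nu}$ and the confluent hypergeometric function ${}_{1}F_{1}$ [35, p. 489, Eq. (2.11)]: For Re $(a)>\tfrac{1}{2}(m-1)$ , symmetric $M$ , and $Z>0$ ,

$$\Gamma_{m}(\nu+\tfrac{1}{2}(m+1))\int_{R>0}A_{\nu}(MR)\,(\det R)^{a-\frac{1}{2}(m+1)}\operatorname{etr}(-RZ)\,\mathrm{d}R=\Gamma_{m}(a)\,(\det Z)^{-a}\,{}_{1}F_{1}\big(a;\nu+\tfrac{1}{2}(m+1);-MZ^{-1}\big).\tag{2.18}$$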

This result can also be proved by expressing $A_{\nu}(MR)$ as a series of zonal polynomials and then applying (2.9) to integrate term-by-term.
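The term-by-term argument can be sketched as follows, writing $p=\tfrac{1}{2}(m+1)$ (a shorthand introduced only for this sketch) and assuming that the interchange of summation and integration is justified:

```latex
\begin{aligned}
\Gamma_{m}(\nu+p)\int_{R>0} A_{\nu}(MR)\,(\det R)^{a-p}\operatorname{etr}(-RZ)\,\mathrm{d}R
 &= \sum_{k=0}^{\infty}\frac{(-1)^{k}}{k!}\sum_{|\kappa|=k}\frac{1}{[\nu+p]_{\kappa}}
    \int_{R>0} C_{\kappa}(MR)\,(\det R)^{a-p}\operatorname{etr}(-RZ)\,\mathrm{d}R \\
 &= \Gamma_{m}(a)\,(\det Z)^{-a}\sum_{k=0}^{\infty}\frac{1}{k!}\sum_{|\kappa|=k}
    \frac{[a]_{\kappa}}{[\nu+p]_{\kappa}}\,(-1)^{k}\,C_{\kappa}(MZ^{-1}) \\
 &= \Gamma_{m}(a)\,(\det Z)^{-a}\,{}_{1}F_{1}\big(a;\nu+p;-MZ^{-1}\big).
\end{aligned}
```

The first equality uses the series (2.11), the second uses (2.9), and the third uses the homogeneity $(-1)^{|\kappa|}C_{\kappa}(Y)=C_{\kappa}(-Y)$ together with the definition (2.16). Setting $a=\nu+p$ makes the ratio of partitional shifted factorials equal to one, and (2.4) then recovers the right-hand side of (2.14).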

Given partitions $\kappa$ and $\sigma$ , we denote by $\binom{\kappa}{\sigma}$ the generalized binomial coefficient [53, pp. 267-269], [56, Eq. (35.6.3)]. For $\gamma>-1$ and $Y\in\mathcal{S}^{m\times m}$ , the (generalized) Laguerre polynomial $L^{(\gamma)}_{\kappa}(Y)$ , corresponding to $\kappa$ , is defined as

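$$L^{(\gamma)}_{\kappa}(Y)=\big[\gamma+\tfrac{1}{2}(m+1)\big]_{\kappa}\,C_{\kappa}(I_{m})\sum_{s=0}^{|\kappa|}\sum_{|\sigma|=s}\binom{\kappa}{\sigma}\frac{C_{\sigma}(-Y)}{[\gamma+\frac{1}{2}(m+1)]_{\sigma}\,C_{\sigma}(I_{m})},\tag{2.19}$$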

Setting $Y=0$ in (2.19), we obtain

$$L^{(\gamma)}_{\kappa}(0)=\big[\gamma+\tfrac{1}{2}(m+1)\big]_{\kappa}\,C_{\kappa}(I_{m}).\tag{2.20}$$

The normalized (generalized) Laguerre polynomial corresponding to $\kappa$ is defined by

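$$\mathcal{L}^{(\gamma)}_{\kappa}(Y):=\big(|\kappa|!\,L^{(\gamma)}_{\kappa}(0)\big)^{-1/2}L^{(\gamma)}_{\kappa}(Y),\tag{2.21}$$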

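$Y\in\mathcal{S}^{m\times m}$ . By [53, Theorem 7.6.5], the polynomials $\mathcal{L}^{(\gamma)}_{\kappa}$ are orthonormal with respect to the Wishart distribution $W_{m}(\gamma+\tfrac{1}{2}(m+1),I_{m})$ :

$$\frac{1}{\Gamma_{m}\big(\gamma+\tfrac{1}{2}(m+1)\big)}\int_{Y>0}\mathcal{L}^{(\gamma)}_{\kappa}(Y)\,\mathcal{L}^{(\gamma)}_{\sigma}(Y)\,(\det Y)^{\gamma}\operatorname{etr}(-Y)\,\mathrm{d}Y=\begin{cases}1,&\kappa=\sigma\\ 0,&\kappa\neq\sigma\end{cases}.\tag{2.22}$$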
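For orientation, the case $m=1$ can be worked out by hand (this reduction is an added illustration). With $\kappa=(k)$, $C_{\kappa}(y)=y^{k}$, and the generalized binomial coefficient reducing to the ordinary $\binom{k}{s}$, the polynomial defined in (2.19) becomes $k!$ times the classical Laguerre polynomial $L_{k}^{\gamma}$, and the normalization (2.21) matches the classical orthogonality relation:

```latex
\begin{aligned}
L^{(\gamma)}_{(k)}(y) &= (\gamma+1)_{k}\sum_{s=0}^{k}\binom{k}{s}\frac{(-y)^{s}}{(\gamma+1)_{s}}
  = k!\,L_{k}^{\gamma}(y), \\
\mathcal{L}^{(\gamma)}_{(k)}(y) &= \Big(\frac{k!}{(\gamma+1)_{k}}\Big)^{1/2} L_{k}^{\gamma}(y), \\
\frac{1}{\Gamma(\gamma+1)}\int_{0}^{\infty}\mathcal{L}^{(\gamma)}_{(k)}(y)\,\mathcal{L}^{(\gamma)}_{(j)}(y)\,y^{\gamma}e^{-y}\,\mathrm{d}y
 &= \frac{(k!\,j!)^{1/2}}{\big((\gamma+1)_{k}(\gamma+1)_{j}\big)^{1/2}}\cdot
    \frac{\Gamma(k+\gamma+1)}{k!\,\Gamma(\gamma+1)}\,\delta_{kj} = \delta_{kj}.
\end{aligned}
```

The last line uses the classical relation $\int_{0}^{\infty}L_{k}^{\gamma}(y)L_{j}^{\gamma}(y)\,y^{\gamma}e^{-y}\,\mathrm{d}y=\Gamma(k+\gamma+1)\,\delta_{kj}/k!$.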

By [53, p. 282], for $\gamma>-1$ and $Z>0$ , there holds the Laplace transform,

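$$\int_{Y>0}\operatorname{etr}(-YZ)\,(\det Y)^{\gamma}\,L^{(\gamma)}_{\kappa}(Y)\,\mathrm{d}Y=\big[\gamma+\tfrac{1}{2}(m+1)\big]_{\kappa}\,\Gamma_{m}(\gamma+\tfrac{1}{2}(m+1))\,(\det Z)^{-\gamma-\frac{1}{2}(m+1)}\,C_{\kappa}(I_{m}-Z^{-1}).\tag{2.23}$$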

Further, by [53, Theorem 7.6.4, p. 284], for $\gamma>-1$ and $Z\in\mathcal{S}^{m\times m}$ ,

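$$\operatorname{etr}(-Z)\,L^{(\gamma)}_{\kappa}(Z)=\int_{Y>0}\operatorname{etr}(-Y)\,(\det Y)^{\gamma}\,C_{\kappa}(Y)\,A_{\gamma}(ZY)\,\mathrm{d}Y.\tag{2.24}$$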

Lemma 2.2.

Let $Z>0$ and $\gamma>-1$ , then

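$$\big|L^{(\gamma)}_{\kappa}(Z)\big|\leq\operatorname{etr}(Z)\,\big[\gamma+\tfrac{1}{2}(m+1)\big]_{\kappa}\,C_{\kappa}(I_{m}).\tag{2.25}$$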

Also, for $v\in\mathbb{R}$ , $v>0$ ,

[TABLE]

Proof.

By (2.13) and (2.24),

[TABLE]

Applying (2.9) to evaluate the latter integral, we obtain (2.25).

To establish (2.26), we substitute $Z=vI_{m}$ into (2.23), obtaining

[TABLE]

Differentiating both sides of the latter equation with respect to $v$ and simplifying the outcome, we obtain the stated result. ∎

2.3 Hankel transforms of matrix argument

Throughout the rest of the paper, if $X$ is a random entity, we denote expectation with respect to the distribution of $X$ by $E_{X}$ or simply by $E$ whenever the context is clear.

Let $X>0$ be a random matrix with probability density function $f(X)$ . For Re $(\nu)>\frac{1}{2}(m-2)$ , we define the Hankel transform of order $\nu$ of $X$ as the function

[TABLE]

$T>0$ . The Hankel transform satisfies the following properties:

Lemma 2.3.

For Re $(\nu)>\frac{1}{2}(m-2)$ , ${|\mathcal{H}_{X,\nu}(T)|}\leq 1$ for all $T>0$ , and $\mathcal{H}_{X,\nu}(T)$ is a continuous function of $T$ .

Proof.

By (2.13),

[TABLE]

for all $T,X>0$ . Therefore, by the triangle inequality, $|\mathcal{H}_{X,\nu}(T)|\leq E_{X}(1)=1$ .

Since $A_{\nu}(TX)$ is bounded and continuous in $T>0$ for every fixed $X>0$ , the continuity of $\mathcal{H}_{X,\nu}(T)$ follows by Dominated Convergence. ∎

Example 2.4.

Let $X\sim W_{m}(\alpha,\Sigma)$ , $\alpha>\frac{1}{2}(m-1)$ , $\Sigma>0$ . For $T>0$ , it follows from the definition (2.27) of the Hankel transform that

[TABLE]

Applying (2.18) to calculate this integral, we obtain

[TABLE]

For the case in which $\nu=\alpha-\tfrac{1}{2}(m+1)$ , (2.28) reduces to

[TABLE]
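A small simulation can illustrate Example 2.4 when $m=1$. The sketch rests on several assumptions that are not displayed in the text here: that the Hankel transform (2.27) is $\mathcal{H}_{X,\nu}(t)=E[\Gamma(\nu+1)A_{\nu}(tX)]$ in the scalar case (the normalization suggested by Lemma 2.3), that $W_{1}(\alpha,\sigma)$ is the gamma law with shape $\alpha$ and rate $\sigma$, and that for $\nu=\alpha-1$ the transform reduces to $\exp(-t/\sigma)$, which follows from ${}_{1}F_{1}(\alpha;\alpha;-y)=e^{-y}$. The function names, seed, and sample size are illustrative choices.

```python
import math
import random


def bessel_a(nu: float, y: float, terms: int = 120) -> float:
    """Truncated scalar series A_nu(y) = sum_k (-1)^k y^k / (k! Gamma(nu + 1 + k))."""
    term = 1.0 / math.gamma(nu + 1.0)
    total = term
    for k in range(terms - 1):
        term *= -y / ((k + 1) * (nu + 1.0 + k))
        total += term
    return total


def empirical_hankel(sample, nu: float, t: float) -> float:
    """Sample mean of Gamma(nu + 1) * A_nu(t * x), an assumed scalar analogue of (2.27)."""
    scale = math.gamma(nu + 1.0)
    return sum(scale * bessel_a(nu, t * x) for x in sample) / len(sample)


rng = random.Random(0)
alpha, sigma = 2.5, 1.5  # shape and rate of the gamma (m = 1 Wishart) law
# random.gammavariate takes a scale parameter, so the rate sigma enters as 1 / sigma
sample = [rng.gammavariate(alpha, 1.0 / sigma) for _ in range(5000)]
nu = alpha - 1.0  # the special order considered at the end of Example 2.4

for t in (0.5, 1.0, 2.0):
    # empirical transform next to the assumed closed form exp(-t / sigma)
    print(t, empirical_hankel(sample, nu, t), math.exp(-t / sigma))
```

Because each summand is bounded by one in absolute value, the Monte Carlo error of the empirical transform is at most of order $n^{-1/2}$ for a sample of size $n$.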

Example 2.5.

Let $Z\sim W_{m}(\alpha,I_{m})$ and $X>0$ be an $m\times m$ random matrix that is independent of $Z$ . For $T>0$ ,

[TABLE]

To prove this result, we again apply (2.27) and the independence of $X$ and $Z$ , obtaining

[TABLE]

Since $A_{\nu}(T^{1/2}ZT^{1/2}X)=A_{\nu}(T^{1/2}XT^{1/2}Z)$ , we have

[TABLE]

Applying Example 2.4, we obtain

[TABLE]

Combining (2.30) and (2.31), we obtain the formula stated at the beginning of this example.

In particular, if $\nu=\alpha-\frac{1}{2}(m+1)$ then, by Kummer’s formula (2.17), we obtain

[TABLE]

the Laplace transform of $X$ .

Throughout the remainder of the paper, if $X$ and $Y$ are random entities we write $X\stackrel{d}{=}Y$ whenever $X$ and $Y$ have the same distribution. If $\{X_{n},n\geq 1\}$ is a sequence of random entities, we write $X_{n}\xrightarrow{d}X$ whenever $X_{n}$ converges in distribution to $X$ .

Theorem 2.6.

(Uniqueness of the Hankel transform).

Let $X$ and $Y$ be $m\times m$ positive definite random matrices with Hankel transforms $\mathcal{H}_{X,\nu}$ and $\mathcal{H}_{Y,\nu}$ , respectively. Then $\mathcal{H}_{X,\nu}=\mathcal{H}_{Y,\nu}$ if and only if $X\stackrel{d}{=}Y$ .

Proof.

If $X\stackrel{d}{=}Y$ then it is clear that $\mathcal{H}_{X,\nu}=\mathcal{H}_{Y,\nu}$ .

Conversely, suppose that $Z\sim W_{m}(\nu+\frac{1}{2}(m+1),I_{m})$ , independently of $X$ and $Y$ . Let $\Psi_{X}$ and $\Psi_{Y}$ be the Laplace transforms of $X$ and $Y$ respectively; then, for all $T>0$ ,

[TABLE]

and therefore

[TABLE]

By Example 2.5,

[TABLE]

and

[TABLE]

for all $T>0$ . Combining (2.32), (2.33) and (2.34), we obtain $\Psi_{X}(T)=\Psi_{Y}(T)$ , for all $T>0$ . By the uniqueness theorem for multivariate Laplace transforms [23, p. 16, Theorem 2.1.9] we conclude that $X\stackrel{{\scriptstyle d}}{{=}}Y$ . ∎

We denote by $L^{2}_{\nu}$ the space of functions $\phi:\mathcal{P}^{m\times m}_{+}\to\mathbb{C}$ such that

[TABLE]

The following inversion theorem is obtained by applying the Hankel inversion theory of Herz [35, Section 3]. We refer to Hadjicosta [32] for the full details.

Theorem 2.7.

(Inversion of the Hankel transform).

Let $X>0$ be a random matrix with Hankel transform $\mathcal{H}_{X,\nu}$ , and with a probability density function $f\in L^{2}_{\nu}$ . Then,

[TABLE]

Theorem 2.8.

(Hankel Continuity). Let $\{X_{n},n\in\mathbb{N}\}$ be a sequence of $m\times m$ positive-definite random matrices with corresponding Hankel transforms $\{\mathcal{H}_{X_{n}},n\in\mathbb{N}\}$ . If there exists an $m\times m$ positive semi-definite random matrix $X$ with Hankel transform $\mathcal{H}_{X}$ such that $X_{n}\xrightarrow{d}X$ then, for each $T>0$ ,

[TABLE]

Conversely, suppose there exists a function $\mathcal{H}:\mathcal{P}_{+}^{m\times m}\rightarrow\mathbb{R}$ such that $\mathcal{H}(T)\to 1$ as $T\to 0$ , $\mathcal{H}$ is continuous at $0$ , and (2.35) holds. Then $\mathcal{H}$ is the Hankel transform of an $m\times m$ positive semi-definite random matrix $X$ , and $X_{n}\xrightarrow{d}X$ .

Proof.

Suppose that $X_{n}\xrightarrow{d}X$ ; then, by the Continuous Mapping Theorem for random vectors [61, p. 336], $A_{\nu}(TX_{n})\xrightarrow{d}A_{\nu}(TX)$ as $n\to\infty$ , for all $T>0$ . By (2.13), $A_{\nu}(TX_{n})$ is uniformly bounded for all $n\in\mathbb{N}$ and $T>0$ ; thus, by the Dominated Convergence Theorem, $EA_{\nu}(TX_{n})\to EA_{\nu}(TX)$ as $n\to\infty$ , for all $T>0$ , and therefore (2.35) holds.

Conversely, suppose that $Z\sim W_{m}(\nu+\frac{1}{2}(m+1),I_{m})$ where $Z$ is independent of the sequence $\{X_{n},n\in\mathbb{N}\}$ . Also, let $\Psi_{X_{n}}$ be the Laplace transform of $X_{n}$ . By Example 2.5, we have

[TABLE]

for all $T>0$ . Further, by Lemma 2.3, $|\mathcal{H}_{X_{n}}(T^{1/2}ZT^{1/2})|\leq 1$ for all $T>0$ . Thus, by the Dominated Convergence Theorem, as $n\to\infty$ ,

[TABLE]

for all $T>0$ . Since $\mathcal{H}$ is continuous at $0$ and $\mathcal{H}(0)=1$ , $\Psi(T)$ also is continuous at $0$ and $\Psi(0)=1$ . By the continuity theorem for multivariate Laplace transforms [41, p. 63, Theorem 4.3], there is an $m\times m$ positive semi-definite random matrix $X$ whose Laplace transform is $\Psi$ , and $X_{n}\xrightarrow{d}X$ . ∎

The next result constitutes a characterization of the Wishart distributions using the Hankel transform $\mathcal{H}_{X,\nu}$ , where $\operatorname{Re}(\nu)>\tfrac{1}{2}(m-2)$ . The result enables the extension, to the Wishart case, of some results of Baringhaus and Taherizadeh [8] on a supremum norm test statistic.

Theorem 2.9.

Let ${X}$ be an $m\times m$ positive-definite random matrix with an orthogonally invariant distribution and Hankel transform $\mathcal{H}_{{X},\nu}$ . If there exist $\epsilon>0$ and $\alpha>\frac{1}{2}(m-1)$ such that for all $T$ satisfying $0<T\leq\epsilon I_{m}$ ,

[TABLE]

then ${X}\sim W_{m}(\alpha,I_{m})$ .

We refer the reader to Hadjicosta [32], where three proofs of this result are given. We provide here the third and briefest proof, which uses the principle of analytic continuation.

Proof.

The Hankel transform, $\mathcal{H}_{{X},\nu}(T)$ , of ${X}$ is holomorphic (analytic) in $T$ . Also, the hypergeometric function ${}_{1}F_{1}(\alpha;\nu+\tfrac{1}{2}(m+1);-T)$ is holomorphic in $T$ . Since these two functions agree on the open neighborhood $\{T:0<T<\epsilon I_{m}\}$ then, by analytic continuation, they agree wherever they both are well-defined. Since they both are well-defined everywhere then we conclude that $\mathcal{H}_{{X},\nu}(T)={}_{1}F_{1}(\alpha;\nu+\tfrac{1}{2}(m+1);-T)$ for all $T>0$ . By Example 2.4 and Theorem 2.6, the uniqueness theorem for Hankel transforms, it follows that ${X}\sim W_{m}(\alpha,I_{m})$ . ∎

2.4 Orthogonally invariant Hankel transforms of matrix argument

For $\nu\in\mathbb{C}$ such that $-\nu+\tfrac{1}{2}(j-m)\notin\mathbb{N}$ , for all $j=1,\dotsc,m$ , and $X,Y\in\mathcal{S}^{m\times m}$ , the Bessel function (of the first kind) of order $\nu$ with two matrix arguments is defined as the infinite series

[TABLE]

It is straightforward from (2.5) and (2.11) to see that

[TABLE]

$X,Y\in\mathcal{S}^{m\times m}$ [53, p. 260]. Also, by applying the inequality (2.13) for $A_{\nu}(X)$ , we obtain

[TABLE]

Definition 2.10.

Let $X$ be an $m\times m$ positive-definite random matrix with p.d.f. $f(X)$ . For $\operatorname{Re}(\nu)>\frac{1}{2}(m-2)$ and $T>0$ , we define the orthogonally invariant Hankel transform of order $\nu$ of $X$ as the function

[TABLE]

Remark 2.11.

By (2.37) and the definition (2.27) of $\mathcal{H}_{X,\nu}$ , we have

[TABLE]

Further, since $\int_{O(m)}\hskip 1.0pt{\rm{d}}H=1$ , then $\mathcal{\widetilde{H}}_{X,\nu}$ also satisfies the properties in Lemma 2.3.

Let $a$ , $b\in\mathbb{C}$ , where $-b+\tfrac{1}{2}(j-1)\notin\mathbb{N}$ , for all $j=1,\dotsc,m$ . The confluent hypergeometric function of two matrix arguments is defined, for $X,Y\in\mathcal{S}^{m\times m}$ , as the infinite series,

[TABLE]

It is clear from the definition that ${{}_{1}}F_{1}(a;b;X,I_{m})={{}_{1}}F_{1}(a;b;X).$ Similar to (2.37), it follows from (2.5) that for $X,Y\in\mathcal{S}^{m\times m}$ ,

[TABLE]

Example 2.12.

Let $X\sim W_{m}(\alpha,\Sigma)$ where $\alpha>\frac{1}{2}(m-1)$ and $\Sigma>0$ . For $T>0$ , it follows from Example 2.4, (2.40), and (2.42) that

[TABLE]

Theorem 2.13.

(Uniqueness of orthogonally invariant Hankel transforms).

Let ${X}$ and ${Y}$ be $m\times m$ positive-definite random matrices with orthogonally invariant distributions and orthogonally invariant Hankel transforms $\widetilde{\mathcal{H}}_{{X},\nu}$ and $\widetilde{\mathcal{H}}_{{Y},\nu}$ , respectively. Then $\widetilde{\mathcal{H}}_{{X},\nu}=\widetilde{\mathcal{H}}_{{Y},\nu}$ if and only if ${X}\stackrel{d}{=}{Y}$ .

Proof.

By Eq. (2.37) and the definition of the orthogonally invariant Hankel transform (2.39), we have

[TABLE]

Since the distribution of $X$ is orthogonally invariant, $X\stackrel{d}{=}HXH^{\prime}$ for all $H\in O(m)$ ; therefore, for all $T>0$ ,

[TABLE]

and similarly for $Y$ . By applying Theorem 2.6, the Uniqueness Theorem for Hankel transforms, we deduce the desired result. ∎

3 Goodness-of-Fit Tests for the Wishart Distributions

3.1 The test statistic

Let $X_{1},\dotsc,X_{n}$ be independent, identically distributed (i.i.d.), $m\times m$ positive-definite random matrices, each with probability density function $f(X)$ and positive-definite mean $\mu=E(X_{1})$ . We assume also that the density function of $X_{1}$ is of the form

[TABLE]

where $f_{0}$ is orthogonally invariant.

Lemma 3.1.

Under the assumption (3.1), the distribution of $\mu^{-1/2}X_{1}\mu^{-1/2}$ is orthogonally invariant.

Proof.

Let $\widetilde{Y}=\mu^{-1/2}X_{1}\mu^{-1/2}$ ; then $X_{1}=\mu^{1/2}\widetilde{Y}\mu^{1/2}$ and the Jacobian of the transformation from $X_{1}$ to $\widetilde{Y}$ is $(\det\mu)^{(m+1)/2}$ [53, p. 58]. Therefore, the p.d.f. of $\widetilde{Y}$ is

[TABLE]

Since $f_{0}$ is orthogonally invariant then it follows that $g$ is orthogonally invariant. ∎

Denote by $P$ the distribution of $X_{1}$ . On the basis of the random sample $X_{1},\dotsc,X_{n}$ , we wish to test the null hypothesis, $H_{0}:P\in\{W_{m}(\alpha,\Sigma),\Sigma>0\}$ , against the alternative, $H_{1}:P\notin\{W_{m}(\alpha,\Sigma),\Sigma>0\}$ , where $\alpha$ is known.

Since $\Sigma$ is unspecified by $H_{0}$ , the data $X_{1},\dotsc,X_{n}$ cannot be used directly to construct a test statistic. Thus, with $\bar{X}_{n}=n^{-1}\sum_{j=1}^{n}X_{j}$ denoting the sample mean, define $Y_{j}=\bar{X}_{n}^{-1/2}X_{j}\bar{X}_{n}^{-1/2}$ , for $j=1,\dotsc,n$ . Under $H_{0}$ , the distribution of $Y_{1},\ldots,Y_{n}$ does not depend on $\Sigma$ , so a test statistic can be based on them. Let $P_{0}$ denote the probability measure corresponding to the $W_{m}(\alpha,I_{m})$ distribution. For $\operatorname{Re}(\nu)>\frac{1}{2}(m-2)$ , define the empirical orthogonally invariant Hankel transform of order $\nu$ of $Y_{1},\ldots,Y_{n}$ as

[TABLE]

$T>0$ . Further, define the test statistic

[TABLE]
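As a hedged numerical sketch of the standardization $Y_{j}=\bar{X}_{n}^{-1/2}X_{j}\bar{X}_{n}^{-1/2}$ introduced above, the following Python snippet (assuming NumPy is available; the helper names `inv_sqrt_spd`, `standardize`, and `random_spd` are ours, not the paper's) forms the $Y_{j}$ for arbitrary positive-definite inputs. It checks two facts that follow directly from the definition: the $Y_{j}$ average to $I_{m}$, and the eigenvalues of each $Y_{j}$ are unchanged when every $X_{j}$ is replaced by $AX_{j}A^{\prime}$ for a nonsingular $A$ . The second fact is consistent with the $\Sigma$ -invariance noted above, since the statistic is built from orthogonally invariant functions of the $Y_{j}$ .

```python
import numpy as np

def inv_sqrt_spd(a):
    """Inverse symmetric square root of a symmetric positive-definite matrix."""
    w, v = np.linalg.eigh(a)
    return (v / np.sqrt(w)) @ v.T

def standardize(xs):
    """Return Y_j = Xbar^{-1/2} X_j Xbar^{-1/2} for a stack xs of shape (n, m, m)."""
    r = inv_sqrt_spd(xs.mean(axis=0))
    return r @ xs @ r

def random_spd(rng, n, m):
    """Illustrative positive-definite inputs G G' + I (these are not Wishart draws)."""
    g = rng.standard_normal((n, m, m))
    return g @ np.swapaxes(g, 1, 2) + np.eye(m)

rng = np.random.default_rng(0)
xs = random_spd(rng, n=6, m=3)
ys = standardize(xs)

# The standardized matrices average to the identity matrix.
mean_is_identity = np.allclose(ys.mean(axis=0), np.eye(3))

# Replacing each X_j by A X_j A' leaves the eigenvalues of each Y_j unchanged.
a = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)
ys_a = standardize(a @ xs @ a.T)
same_spectrum = np.allclose(np.linalg.eigvalsh(ys), np.linalg.eigvalsh(ys_a))
```

The inputs here are generic positive-definite matrices rather than Wishart samples, because both checks are algebraic identities and do not depend on the sampling distribution.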

To provide motivation for this test statistic, suppose that $H_{0}$ is valid; then $E(X_{1})=\alpha\Sigma^{-1}$ and, for large $n$ , we can expect that $Y_{j}=\bar{X}_{n}^{-1/2}X_{j}\bar{X}_{n}^{-1/2}\simeq\alpha^{-1}{\Sigma}^{1/2}X_{j}{\Sigma}^{1/2}$ , almost surely. By the Continuous Mapping Theorem, the sequence of random variables $A_{\nu}(T,Y_{j})$ should approximate the i.i.d. sequence $A_{\nu}(T,\alpha^{-1}{\Sigma}^{1/2}X_{j}{\Sigma}^{1/2})$ , $j=1,\dotsc,n$ , for each $T>0$ and for sufficiently large $n$ . Applying to (3.2) the Strong Law of Large Numbers, we can expect that, for large $n$ , $\mathcal{\widetilde{H}}_{n,\nu}(T)\simeq\mathcal{\widetilde{H}}_{\alpha^{-1}\Sigma^{1/2}X_{1}\Sigma^{1/2},\nu}(T)$ , almost surely.

By Example 2.12, we deduce that

[TABLE]

for $T>0$ . Therefore, by Lemma 3.1 and Theorem 2.13, small values of $\boldsymbol{T}_{n,\nu}^{2}$ provide strong evidence in support of $H_{0}$ , and we will reject $H_{0}$ for large values of $\boldsymbol{T}^{2}_{n,\nu}$ .

For the remainder of the paper, we set

[TABLE]

Since $\nu>\frac{1}{2}(m-2)$ then $\alpha>\frac{1}{2}(2m-1)$ . We also denote $\boldsymbol{T}^{2}_{n,\nu}$ and $\mathcal{\widetilde{H}}_{n,\nu}$ by $\boldsymbol{T}^{2}_{n}$ and $\mathcal{\widetilde{H}}_{n}$ , respectively. By Kummer’s formula (2.17), the statistic (3.3) becomes

[TABLE]

This integral represents $\boldsymbol{T}^{2}_{n}$ as a weighted integral of the squared difference between the empirical orthogonally invariant Hankel transform $\mathcal{\widetilde{H}}_{n}$ and its almost sure limit under the null hypothesis.

We now evaluate the test statistic $\boldsymbol{T}_{n}^{2}$ for a given random sample.

Proposition 3.2.

The test statistic (3.4) is a $V$ -statistic of order 2. Specifically,

[TABLE]

where, for $X,Y>0$ ,

[TABLE]

Proof.

After squaring the integrand in (3.4), we see that there are three terms to be computed. First,

[TABLE]

By (2.37) and Fubini’s theorem,

[TABLE]

Writing $A_{\nu}(HTH^{\prime}Y_{j})=A_{\nu}(H^{\prime}Y_{j}HT)$ , $j=1,\ldots,n$ , and applying Herz’s generalization (2.15) of Weber’s second exponential integral, we find that (3.5) equals

[TABLE]

On the right-hand side of (3.6), we replace $H$ by $HK$ and apply the group invariance of the Haar measure and its normalization; then we find that (3.6) reduces to

[TABLE]

Therefore,

[TABLE]

The second term to be calculated is

[TABLE]

Similar to the previous calculation, we use (2.37) to express $A_{\nu}(T,Y_{i})$ as an average over $O(m)$ and apply Fubini’s theorem to reverse the order of integration. The resulting integral is a special case of (2.14), so we conclude that

[TABLE]

The third and last integral, which we evaluate using the gamma integral (2.10) is

[TABLE]

Collecting together the three terms, we obtain the desired result. ∎

3.2 The limiting null distribution of the test statistic

We denote by $L^{2}=L^{2}(P_{0})$ the space of (equivalence classes of) orthogonally invariant Borel measurable functions $f:\mathcal{P}_{+}^{m\times m}\rightarrow\mathbb{C}$ that are square-integrable with respect to the probability measure $P_{0}$ , i.e., for which $\int_{X>0}{|f(X)|^{2}}\,\hskip 1.0pt{\rm{d}}P_{0}(X)<\infty$ . The space $L^{2}$ is a separable Hilbert space when equipped with the inner product

[TABLE]

and the corresponding norm

[TABLE]

$f,g\in L^{2}$ . Moreover, the set of normalized Laguerre polynomials $\{\mathcal{L}_{\kappa}^{(\nu)}\}$ , with $\kappa$ ranging over all partitions, defined in Section 2.2, forms an orthonormal basis for the space $L^{2}$ ; see Herz [35, p. 502, Theorem 4.6] and Constantine [17, Section 3].

We now define the stochastic process

[TABLE]

$T>0$ . We view the random field $\mathcal{Z}_{n}:=\{\mathcal{Z}_{n}(T),T>0\}$ as a random element in $L^{2}$ since, as we now show, its sample paths are in $L^{2}$ .

Lemma 3.3.

The test statistic (3.4) can be written as

[TABLE]

In particular, $\lVert\mathcal{Z}_{n}\rVert^{2}_{L^{2}}<\infty$ .

This result follows immediately from (3.2), (3.4), and (3.7).

Remark 3.4.

By [29, Example 1.4], $(Y_{1},\dotsc,Y_{n})$ has a matrix Liouville distribution of the second kind that does not depend on $\Sigma$ . Therefore, without loss of generality, we will set $\Sigma=I_{m}$ in deriving the limiting null distribution of $\boldsymbol{T}^{2}_{n}$ .

We also note that, for each $j=1,\ldots,n$ , the matrices $Y_{j}=\bar{X}_{n}^{-1/2}X_{j}\bar{X}_{n}^{-1/2}$ and $Z_{j}=X_{j}^{1/2}\bar{X}_{n}^{-1}X_{j}^{1/2}$ have the same spectrum; this result is proved by verifying that $Y_{j}$ and $Z_{j}$ have the same characteristic polynomial. Consequently,

[TABLE]

$j=1,\dotsc,n$ , so we can replace $Y_{j}$ by $Z_{j}$ in the definition (3.2) of the test statistic.
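The equality of spectra of $Y_{j}$ and $Z_{j}$ can also be checked numerically. The sketch below assumes NumPy and uses a hypothetical helper `spd_power` (our name) that computes powers of a symmetric positive-definite matrix through its eigendecomposition; the input matrices are arbitrary positive-definite matrices chosen only for illustration.

```python
import numpy as np

def spd_power(a, p):
    """Return a**p for a symmetric positive-definite matrix a."""
    w, v = np.linalg.eigh(a)
    return (v * w**p) @ v.T

rng = np.random.default_rng(1)
g = rng.standard_normal((5, 3, 3))
xs = g @ np.swapaxes(g, 1, 2) + np.eye(3)   # illustrative positive-definite X_j
xbar = xs.mean(axis=0)

xbar_inv_half = spd_power(xbar, -0.5)
xbar_inv = spd_power(xbar, -1.0)

max_gap = 0.0
for x in xs:
    y = xbar_inv_half @ x @ xbar_inv_half   # Y_j
    x_half = spd_power(x, 0.5)
    z = x_half @ xbar_inv @ x_half          # Z_j
    gap = np.max(np.abs(np.linalg.eigvalsh(y) - np.linalg.eigvalsh(z)))
    max_gap = max(max_gap, gap)
```

Both $Y_{j}$ and $Z_{j}$ are similar to $X_{j}\bar{X}_{n}^{-1}$ , which is why `max_gap` is zero up to rounding error.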

We now state the main result of this section.

Theorem 3.5.

Let $m\geq 2$ and $X_{1},\dotsc,X_{n}$ be i.i.d. $P_{0}$ -distributed random matrices, where $\alpha>\max\{\tfrac{1}{2}(2m-1),\tfrac{1}{2}(m+3)\}$ , and let $\mathcal{Z}_{n}:=(\mathcal{Z}_{n}(T),T>0)$ be the random field defined in (3.7). Then, there exists a centered Gaussian field $\mathcal{Z}:=(\mathcal{Z}(T),T>0)$ , with sample paths in $L^{2}$ and with covariance function,

[TABLE]

$S,T>0$ , such that $\mathcal{Z}_{n}\xrightarrow{d}\mathcal{Z}$ in $L^{2}$ as $n\rightarrow\infty$ . Moreover,

[TABLE]

The remainder of this section is devoted to proving Theorem 3.5, so readers who wish to postpone reading the detailed derivation may continue directly to Section 3.3.

3.2.1 Preliminary details

Here, we provide details on the Frobenius norm of a matrix, the Taylor expansion of functions on the space $\mathcal{S}^{m\times m}$ of symmetric matrices, and various preliminary lemmata necessary for the derivation of the asymptotic distribution of $\boldsymbol{T}^{2}_{n}$ .

For $X,Y\in\mathcal{S}^{m\times m}$ , the inner product between $X$ and $Y$ is defined by $\langle X,Y\rangle=\operatorname{tr}(XY),$ and the Frobenius norm of $X$ is defined by $\lVert X\rVert^{2}_{F}=\langle X,X\rangle=\operatorname{tr}(X^{2}).$ By [37, Section 5.6, p. 291], the Frobenius norm satisfies the triangle inequality, $\lVert X+Y\rVert_{F}\leq\lVert X\rVert_{F}+\lVert Y\rVert_{F},$ and moreover, it is sub-multiplicative, $\lVert XY\rVert_{F}\leq\lVert X\rVert_{F}\cdot\lVert Y\rVert_{F}.$

We use the usual notation for Kronecker’s delta, viz., $\delta_{ij}=1$ or $0$ for $i=j$ or $i\neq j$ , respectively. For $Z=(z_{ij})\in\mathcal{S}^{m\times m}$ , the gradient operator is the $m\times m$ matrix

[TABLE]

For example, it is straightforward to see that $\nabla_{Z}e^{\langle T,Z\rangle}=e^{\langle T,Z\rangle}\,T.$
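To make this example explicit, here is a short derivation under the assumption that the gradient operator has entries $\tfrac{1}{2}(1+\delta_{ij})\,\partial/\partial z_{ij}$ , the usual convention for symmetric matrices and the one suggested by the appearance of Kronecker’s delta above, and that $T=(t_{ij})$ is symmetric. If the normalization of the gradient differs, the constants below change accordingly.

```latex
\begin{align*}
\langle T,Z\rangle &= \operatorname{tr}(TZ)
  = \sum_{i=1}^{m} t_{ii}z_{ii} + 2\sum_{1\le i<j\le m} t_{ij}z_{ij},\\
\frac{\partial}{\partial z_{ij}}\,e^{\langle T,Z\rangle}
  &= (2-\delta_{ij})\,t_{ij}\,e^{\langle T,Z\rangle}, \qquad i\le j,\\
\tfrac{1}{2}(1+\delta_{ij})\,\frac{\partial}{\partial z_{ij}}\,e^{\langle T,Z\rangle}
  &= \tfrac{1}{2}(1+\delta_{ij})(2-\delta_{ij})\,t_{ij}\,e^{\langle T,Z\rangle}
  = t_{ij}\,e^{\langle T,Z\rangle}.
\end{align*}
```

The factor $\tfrac{1}{2}(1+\delta_{ij})(2-\delta_{ij})$ equals $1$ both for $i=j$ and for $i\neq j$ , so the $(i,j)$ th entry of the gradient is $t_{ij}\,e^{\langle T,Z\rangle}$ , in agreement with the stated identity.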

Let $F:\mathcal{S}^{m\times m}\rightarrow\mathbb{C}$ be a $C^{1}$ function; that is, $F$ is differentiable of order $1$ and its partial derivatives are continuous on $\mathcal{S}^{m\times m}$ . The Taylor expansion of order $1$ of the function $F$ , at $Z_{0}\in\mathcal{S}^{m\times m}$ , is

[TABLE]

where $U=tZ+(1-t)Z_{0}$ , for some $t\in[0,1]$ .

Lemma 3.6.

For $T,Z>0$ ,

[TABLE]

where $M:=HTH^{\prime}$ and $Y:=M^{1/2}ZM^{1/2}$ .

Proof.

By (2.37),

[TABLE]

It is straightforward to verify that the conditions given by Burkill and Burkill [14, p. 289, Theorem 8.72] for interchanging derivatives and integrals are satisfied; therefore,

[TABLE]

Setting $M=HTH^{\prime}$ and $Y=M^{1/2}ZM^{1/2}$ , we have $Z=M^{-1/2}YM^{-1/2}$ . By Maass [51, p. 64], $\nabla_{Z}=M^{1/2}\nabla_{Y}M^{1/2}$ ; therefore,

[TABLE]

since $A_{\nu}(Y)$ is scalar-valued. Combining (3.12) and (3.2.1), we obtain (3.11). ∎

We note that all further interchanges of derivatives and integrals are justifiable by appeal to [14, loc. cit.], so we will perform such interchanges without further citation. Also, various positive constants arise in the following calculations, and we will denote them generically by $c,c_{j},C_{j}$ , $j\geq 1$ .

Lemma 3.7.

Let $Q$ be an $m\times m$ matrix such that $0<QQ^{\prime}<I_{m}$ . Also, let $Y$ be an $m\times m$ positive-definite matrix. Then, there exists a constant $c>0$ such that

[TABLE]

Proof. Since the trace is a linear operator, we have

[TABLE]

where $\nabla_{Y}\otimes Y^{1/2}$ is the Kronecker product of the gradient $\nabla_{Y}$ acting on the matrix $Y^{1/2}$ , and $V_{ij}:=\big{(}\nabla_{Y}\otimes Y^{1/2}\big{)}_{ij}$ is the $(i,j)$ th block matrix in that Kronecker product.

By the Cauchy-Schwarz inequality, and the fact that $QQ^{\prime}<I_{m}$ implies $\operatorname{tr}(QQ^{\prime})\leq m$ , we obtain

[TABLE]

Recall from [12, p. 13] the multi-linear operator norm, $|\!|\!|\cdot|\!|\!|$ , which we define here in the following context: If $K_{ij}$ denotes the $(i,j)$ th element of an $m\times m$ matrix $K$ and $(V_{ij})_{kl}$ denotes the $(k,l)$ th element of $V_{ij}:=\big(\nabla_{Y}\otimes Y^{1/2}\big)_{ij}$ , the $(i,j)$ th block in the tensor product $\nabla_{Y}\otimes Y^{1/2}$ , then

[TABLE]

and we define

[TABLE]

Since all norms on a finite-dimensional space are equivalent, there exists a constant $c>0$ such that $\lVert\nabla_{Y}\otimes Y^{1/2}\rVert_{F}\leq 2c\,|\!|\!|\nabla_{Y}\otimes Y^{1/2}|\!|\!|$ . By [21, p. 262, Eq. (6)], there holds the crucial inequality,

[TABLE]

Hence,

[TABLE]

so we obtain

[TABLE]

Combining (3.2.1) and (3.16), we obtain (3.14). ∎

Lemma 3.8.

For $T,Z>0$ , there exists a constant $C>0$ such that

[TABLE]

Proof.

By Eq. (3.11),

[TABLE]

where $M:=HTH^{\prime}$ and $Y:=M^{1/2}ZM^{1/2}$ . By Minkowski’s inequality for integrals,

[TABLE]

since the Frobenius norm is sub-multiplicative.

By Herz’s generalization, (2.12), of the Poisson integral,

[TABLE]

where $c_{1}>0$ . Therefore,

[TABLE]

Applying Minkowski’s inequality and then using (3.14) to bound the integrand, we obtain

[TABLE]

Combining (3.2.1) and (3.2.1), we obtain

[TABLE]

For $H\in O(m)$ , $\lVert M\rVert_{F}=\lVert HTH^{\prime}\rVert_{F}=\lVert T\rVert_{F}$ and

[TABLE]

Hence,

[TABLE]

which completes the proof. ∎

Lemma 3.9.

For $T,Z_{1},Z_{2}>0$ , there exist constants $C_{1},C_{2}>0$ such that

[TABLE]

Proof.

By (3.11),

[TABLE]

where $Y_{j}:=M^{1/2}Z_{j}M^{1/2}$ , $j=1,2$ , and $M:=HTH^{\prime}$ . Applying (2.12) and interchanging derivatives and integrals, we obtain

[TABLE]

where $\hskip 1.0pt{\rm{d}}\mu(Q):=(\det(I_{m}-Q^{\prime}Q))^{\alpha-\tfrac{1}{2}(2m+1)}\hskip 1.0pt{\rm{d}}Q$ . Therefore,

[TABLE]

Let $\theta_{j}:=2\operatorname{tr}(Y_{j}^{1/2}Q)$ and $N_{j}:=\nabla_{Y_{j}}(\operatorname{tr}QY_{j}^{1/2})$ , $j=1,2$ ; then we observe that

[TABLE]

since $|e^{\mathrm{i}\hskip 1.0pt\theta_{1}}|=1$ . Also, using the identity

[TABLE]

we find that

[TABLE]

By applying the same argument as in Lemma 3.7, we obtain

[TABLE]

so, by the Cauchy-Schwarz inequality and the fact that $Q^{\prime}Q<I_{m}$ implies $\operatorname{tr}(QQ^{\prime})\leq m$ , we obtain

[TABLE]

Since the norms $\|\cdot\|_{F}$ and $|\!|\!|\cdot|\!|\!|$ are equivalent, there exists $c>0$ such that

[TABLE]

By a result of Del Moral and Niclas [21, Theorem 1.1, Eq. (4)],

[TABLE]

where $\exp(M)=\sum_{j=0}^{\infty}M^{j}/j!$ is the matrix exponential function. Therefore,

[TABLE]

For any $m\times m$ matrices $M_{1}$ and $M_{2}$ , and for any $K$ such that $\lVert K\rVert_{F}=1$ ,

[TABLE]

Now setting $M_{j}=\exp(-tY_{j}^{1/2})$ , $j=1,2$ , we obtain

[TABLE]

Therefore,

[TABLE]

For any $m\times m$ positive-definite matrix $Y$ and for $t\geq 0$ ,

[TABLE]

hence, for $t\geq 0$ , and $j=1,2$ ,

[TABLE]

Therefore, for $\lVert K\rVert_{F}=1$ , the right-hand side of (3.25) is bounded above by

[TABLE]

Define $X(t):=\exp(-tY_{1}^{1/2})$ , $Y(t):=\exp(-tY_{2}^{1/2})$ , and $\psi(t):=X(t)-Y(t)$ , $t\geq 0$ . Notice that

[TABLE]

and

[TABLE]

with $X(0)=Y(0)=I_{m}$ . Then $\psi(t)$ satisfies the inhomogeneous differential equation

[TABLE]

with initial condition $\psi(0)=0$ . By following the approach of Kågström [42, Section 4], we find that the solution of this differential equation is

[TABLE]
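The integral representation of $\psi(t)=\exp(-tY_{1}^{1/2})-\exp(-tY_{2}^{1/2})$ can be sanity-checked numerically. The sketch below assumes NumPy and takes the solution in one standard variation-of-constants form, $\psi(t)=-\int_{0}^{t}\exp(-(t-s)A)\,(A-B)\,\exp(-sB)\,\mathrm{d}s$ with $A=Y_{1}^{1/2}$ and $B=Y_{2}^{1/2}$ ; the paper's display may arrange the factors differently. The function names and the test matrices are ours and are chosen only for illustration.

```python
import numpy as np

def sym_fun(a, f):
    """Apply a scalar function f to a symmetric matrix through its eigenvalues."""
    w, v = np.linalg.eigh(a)
    return (v * f(w)) @ v.T

def expm_neg(a, t):
    """Matrix exponential exp(-t a) for a symmetric matrix a."""
    return sym_fun(a, lambda w: np.exp(-t * w))

def duhamel_integral(a, b, t, nodes=40):
    """-int_0^t exp(-(t-s)a) (a-b) exp(-s b) ds by Gauss-Legendre quadrature."""
    x, wts = np.polynomial.legendre.leggauss(nodes)
    s = 0.5 * t * (x + 1.0)          # map the nodes from [-1, 1] to [0, t]
    total = np.zeros_like(a)
    for sk, wk in zip(s, wts):
        total += wk * expm_neg(a, t - sk) @ (a - b) @ expm_neg(b, sk)
    return -0.5 * t * total

rng = np.random.default_rng(2)
g1, g2 = rng.standard_normal((2, 3, 3))
y1 = g1 @ g1.T + np.eye(3)           # illustrative positive-definite Y_1
y2 = g2 @ g2.T + np.eye(3)           # illustrative positive-definite Y_2
a = sym_fun(y1, np.sqrt)             # plays the role of Y_1^{1/2}
b = sym_fun(y2, np.sqrt)             # plays the role of Y_2^{1/2}

t = 0.7
psi_direct = expm_neg(a, t) - expm_neg(b, t)
psi_duhamel = duhamel_integral(a, b, t)
err = np.max(np.abs(psi_direct - psi_duhamel))
```

The two expressions agree to quadrature accuracy, which reflects the identity $\frac{\mathrm{d}}{\mathrm{d}s}\big[\exp(-(t-s)A)\exp(-sB)\big]=\exp(-(t-s)A)(A-B)\exp(-sB)$ .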

By Minkowski’s inequality and the sub-multiplicative property of the Frobenius norm,

[TABLE]

Using (3.26) to bound both exponential terms in this integrand, we find that

[TABLE]

Assuming that $\lambda_{\min}(Y_{1}^{1/2})\neq\lambda_{\min}(Y_{2}^{1/2})$ , we calculate the latter integral, obtaining

[TABLE]

Combining (3.23)-(3.27), we obtain

[TABLE]

By continuity, this result remains valid for $\lambda_{\min}(Y_{1}^{1/2})=\lambda_{\min}(Y_{2}^{1/2})$ .
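For orientation, the scalar integral behind this step and the continuity argument can be written out as follows. Here $a:=\lambda_{\min}(Y_{1}^{1/2})$ and $b:=\lambda_{\min}(Y_{2}^{1/2})$ are shorthand introduced only for this sketch, and the multiplicative constants appearing in the paper's bounds are omitted.

```latex
\begin{align*}
\int_{0}^{t} e^{-(t-s)a}\,e^{-sb}\,\mathrm{d}s
  &= e^{-ta}\int_{0}^{t} e^{s(a-b)}\,\mathrm{d}s
   = \frac{e^{-tb}-e^{-ta}}{a-b}, \qquad a\neq b,\\
\lim_{b\to a}\frac{e^{-tb}-e^{-ta}}{a-b}
  &= t\,e^{-ta}
   = \int_{0}^{t} e^{-(t-s)a}\,e^{-sa}\,\mathrm{d}s .
\end{align*}
```

The limit as $b\to a$ coincides with the value of the integral at $a=b$ , which is the continuity property invoked above.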

Next, it follows from (3.2.1) that

[TABLE]

By the Cauchy-Schwarz inequality,

[TABLE]

and by (3.14),

[TABLE]

Therefore, with $c_{5}=m^{1/2}c_{4}c\,\int_{Q^{\prime}Q<I_{m}}{}\,\hskip 1.0pt{\rm{d}}\mu(Q)<\infty$ , we have derived

[TABLE]

By (3.21), Minkowski’s inequality, and the sub-multiplicative property of the Frobenius norm, we obtain

[TABLE]

Applying the bound (3.28), we find that

[TABLE]

By a result of Wihler [70, Eq. (3.2)],

[TABLE]

Since $M=HTH^{\prime}$ , $Y_{1}=M^{1/2}Z_{1}M^{1/2}$ , and $Y_{2}=M^{1/2}Z_{2}M^{1/2}$ , then we have

[TABLE]

Also, for $j=1,2$ ,

[TABLE]

Combining (3.29)-(3.32), and using the fact that $\hskip 1.0pt{\rm{d}}H$ is normalized, we obtain

[TABLE]

which is identical with (3.20). ∎

Let $X$ be a Wishart-distributed random matrix, $X\sim W_{m}(\alpha,I_{m})$ , and define for $m\times m$ positive definite matrices $T$ the matrix-valued function

[TABLE]

Lemma 3.10.

For $T>0$ ,

[TABLE]

Proof. We will establish this result by the method of Laplace transforms. For $R>0$ , the Laplace transform of the function $(\det T)^{\nu}\,\operatorname{tr}g(T)$ is

[TABLE]

We substitute (3.33) into this integral, interchange the trace and expectation, apply Fubini’s theorem to interchange the expectation and the integral, and verify the validity of interchanging derivatives and integrals; then we obtain

[TABLE]

Applying (2.37) to write $A_{\nu}(T,Z)$ as an average of its single-matrix argument counterpart, and reversing the order of integration, we obtain

[TABLE]

The inner integral with respect to $T$ is precisely the Laplace transform (2.14); substituting the outcome of that calculation into (3.37), we obtain

[TABLE]

Interchanging the gradient and the integral, and then the integral and the trace, noting that

[TABLE]

we find that

[TABLE]

since the trace and the integral commute.

Next, we have

[TABLE]

by interchanging integral and derivative. By [53, p. 279, Eq. (41)],

[TABLE]

differentiating this series term-by-term and evaluating the outcome at $t=1$ , we find that (3.39) equals

[TABLE]

By (2.9), $E\,C_{\kappa}(X)=[\alpha]_{\kappa}C_{\kappa}(I_{m})$ ; therefore, by combining (3.38)-(3.40), we obtain

[TABLE]

It is also known from [53, p. 248] that

[TABLE]

for $\lVert tR^{-1}\rVert<1$ , where $\lVert\cdot\rVert$ denotes the maximum of the absolute values of the eigenvalues of $tR^{-1}$ . Differentiating this series term-by-term with respect to $t$ , we obtain

[TABLE]

now setting $t=\alpha^{-1}$ and comparing the outcome with (3.41), we find that

[TABLE]

Therefore, by (2.10),

[TABLE]

evidently a Laplace transform. Comparing this expression with (3.35), we see that the conclusion follows from the uniqueness theorem for Laplace transforms. ∎

Lemma 3.11.

For $T>0$ ,

[TABLE]

Proof. Define for $Y>0$ the function

[TABLE]

By (3.33), $g(T)=E\left[\alpha^{-1}X^{1/2}\,\phi(X)\,X^{1/2}\right]$ , where $X\sim W_{m}(\alpha,I_{m})$ . Since the distribution of $X$ is orthogonally invariant, i.e., $X\overset{d}{=}H^{\prime}XH$ for all $H\in O(m)$ , then

[TABLE]

By (3.44),

[TABLE]

By Maass [51, p. 64], $\nabla_{H^{\prime}ZH}=H^{\prime}\nabla_{Z}H$ ; so it follows that

[TABLE]

However, $A_{\nu}(T,H^{\prime}ZH)=A_{\nu}(T,Z)$ for all $H\in O(m)$ ; therefore,

[TABLE]

Substituting this result into (3.2.1) we obtain, for all $H\in O(m)$ ,

[TABLE]

Since $Hg(T)H^{\prime}=g(T)$ for all $H\in O(m)$ then, by Schur’s Lemma [62, p. 315], $g(T)$ is a scalar matrix, i.e., $g(T)=\gamma_{1}I_{m}$ for some scalar $\gamma_{1}$ . By taking traces and by applying (3.34), we obtain

[TABLE]

therefore,

[TABLE]

The proof is now complete. ∎

The final preliminary result needed for the proof of Theorem 3.5 is the following consequence of [43, Lemma 7, Eq. (20)].

Lemma 3.12.

The integrals

[TABLE]

are finite for all $\alpha>\tfrac{1}{2}(m+1)$ . Further, the integral

[TABLE]

is finite for all $\alpha>\tfrac{1}{2}(m+3)$ .

3.2.2 The proof of the limiting distribution

In what follows, we will use for various matrices $V$ the shorthand notation

[TABLE]

Proof of Theorem 3.5. By (3.10), the Taylor expansion of the Bessel function $A_{\nu}(T,Z)$ at $(T,Z_{0})$ is

[TABLE]

where $U=tZ+(1-t)Z_{0}$ , for some $t\in[0,1]$ . Setting $Z=Z_{j}$ and $Z_{0}=\alpha^{-1}X_{j}$ , $j=1,\dotsc,n$ , in (3.46), we have the Taylor expansion of order 1 of $A_{\nu}(T,Z_{j})$ at $(T,\alpha^{-1}X_{j})$ :

[TABLE]

where $U_{j}=tZ_{j}+(1-t)\alpha^{-1}X_{j}$ , for some $t\in[0,1]$ . Define

[TABLE]

then (3.47) reduces to

[TABLE]

Adding and subtracting the term $\langle\alpha^{-1}X_{j}^{1/2}M_{n}X_{j}^{1/2},\nabla A_{\nu}(T,\alpha^{-1}X_{j})\rangle$ on the right-hand side, we obtain

[TABLE]

where the second equality is obtained by permuting terms cyclically in the inner product. For $T>0$ and $X_{j}>0$ , $j=1,\dotsc,n$ , define the function

[TABLE]

We remark that, since $X_{1},\ldots,X_{n}$ are i.i.d., $E_{X_{j}}g(T,X_{j})$ does not depend on $j$ ; hence,

[TABLE]

is a function evaluated earlier; by (3.43),

[TABLE]

Define the random fields $\mathcal{Z}_{n,1}(T)$ , $\mathcal{Z}_{n,2}(T)$ and $\mathcal{Z}_{n,3}(T)$ , $T>0$ , by

[TABLE]

The random fields $\mathcal{Z}_{n,k}$ , $k=1,2,3$ , arise as follows. To define $\mathcal{Z}_{n,1}(T)$ , we use the first two terms in (3.2.2). To define $\mathcal{Z}_{n,2}(T)$ , we use the same expression from $\mathcal{Z}_{n,1}(T)$ except that the term $g(T,X_{j})$ is replaced by its expected value $g(T)$ , which is given by (3.43). To define $\mathcal{Z}_{n,3}(T)$ , we replace the term $M_{n}$ in $\mathcal{Z}_{n,2}(T)$ by a constant multiple of $\alpha I_{m}-X_{j}$ , the constant being obtained by applying the Law of Large Numbers to $\bar{X}_{n}^{-1/2}$ . We will show that

[TABLE]

By writing $\mathcal{Z}_{n}$ as

[TABLE]

it will follow that $\mathcal{Z}_{n}\xrightarrow{d}\mathcal{Z}$ in $L^{2}$ (cf. Billingsley [10, p. 25, Theorem 4.1]).

To establish (3.49), define for $T>0$ ,

[TABLE]

$j=1,\dotsc,n$ . Since $X_{j}\sim W_{m}(\alpha,I_{m})$ then $E(X_{j}-\alpha I_{m})=0$ and therefore, since the trace and the expectation are linear operators, we deduce that

[TABLE]

Also, by Example 2.12 and (2.17), we have $E\left[\Gamma_{m}(\alpha)A_{\nu}(T,\alpha^{-1}X_{j})\right]=\operatorname{etr}(-\alpha^{-1}T).$ Therefore, $E(\mathcal{Z}_{n,3,j}(T))=0$ , for all $T>0$ and $j=1,\dotsc,n$ , and it is also clear that $\mathcal{Z}_{n,3,1},\dotsc,\mathcal{Z}_{n,3,n}$ are independent and identically distributed random elements in $L^{2}$ .

We now show that $E(\lVert\mathcal{Z}_{n,3,j}\rVert^{2}_{L^{2}})<\infty$ for $j=1,\dotsc,n$ . We have

[TABLE]

By the Cauchy-Schwarz inequality, $(a+b+c)^{2}\leq 3(a^{2}+b^{2}+c^{2})$ for $a,b,c\in\mathbb{R}$ ; so to prove that $E(\lVert\mathcal{Z}_{n,3,j}\rVert^{2}_{L^{2}})<\infty$ , it suffices to prove that

[TABLE]

and

[TABLE]
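The elementary inequality used above is the Cauchy-Schwarz inequality applied to the vectors $(1,1,1)$ and $(a,b,c)$ ; written out as a one-line check:

```latex
\[
(a+b+c)^{2} = (1\cdot a + 1\cdot b + 1\cdot c)^{2}
  \le (1^{2}+1^{2}+1^{2})\,(a^{2}+b^{2}+c^{2})
  = 3\,(a^{2}+b^{2}+c^{2}), \qquad a,b,c\in\mathbb{R}.
\]
```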

To establish (3.54), we apply (2.38) to obtain

[TABLE]

To prove (3.55), write

[TABLE]

therefore, the integral in (3.55) is a constant multiple of

[TABLE]

Since $(\operatorname{tr}(\alpha I_{m}-X_{j}))^{2}$ is a polynomial in $X_{j}$ , its expectation is finite because the moment-generating function of $X_{j}$ exists. As for

[TABLE]

again this integral is finite because $(\operatorname{tr}T)^{2}$ is a polynomial and $\operatorname{etr}(-2\alpha^{-1}T)\,\hskip 1.0pt{\rm{d}}P_{0}(T)$ , after normalization, is a Wishart measure. For the same reason, (3.56) is valid.

In summary, for $T>0$ and $j=1,\dotsc,n$ , $\mathcal{Z}_{n,3,1},\dotsc,\mathcal{Z}_{n,3,n}$ are i.i.d. random elements in $L^{2}$ with $E(\mathcal{Z}_{n,3,j}(T))=0$ and $E(\lVert\mathcal{Z}_{n,3,j}\rVert^{2}_{L^{2}})<\infty$ . Therefore, by the Central Limit Theorem in $L^{2}$ ,

[TABLE]

where $\mathcal{Z}:=(\mathcal{Z}(T),T>0)$ is a centered Gaussian random element in $L^{2}$ . Moreover, $\mathcal{Z}$ has the same covariance operator as $\mathcal{Z}_{n,3,1}$ .

It is well-known that the covariance operator of the random element $\mathcal{Z}_{n,3,1}$ is uniquely determined by the covariance function of the random field $\mathcal{Z}_{n,3,1}$ ; cf., Gīkhman and Skorohod [25, pp. 218-219].

We now show that the function $K(S,T)$ in (3.9) is the covariance function of $\mathcal{Z}_{n,3,1}$ . Noting that $E[\mathcal{Z}_{n,3,1}(T)]=0$ for all $T>0$ , we obtain

[TABLE]

By (3.53),

[TABLE]

so the calculation of $K(S,T)$ reduces to evaluating the four terms obtained by expanding the product on the right-hand side of (3.2.2).

The first term in the product in (3.2.2) is

[TABLE]

By (2.15), (2.37), and Fubini’s theorem, we find that this term equals

[TABLE]

Since $\operatorname{etr}(-\alpha^{-1}(HSH^{\prime}+KTK^{\prime}))=\operatorname{etr}(-\alpha^{-1}(S+T))$ , and

[TABLE]

we conclude that the first term equals

[TABLE]

The second term in the product in (3.2.2) is

[TABLE]

We have seen earlier that

[TABLE]

Also, by (2.37),

[TABLE]

Since $\Gamma_{m}(\alpha)\,A_{\nu}(\alpha^{-1}HSH^{\prime}X_{1})={{}_{0}}F_{1}(\alpha;-\alpha^{-1}HSH^{\prime}X_{1})$ then, by [53, p. 442], the expectation $E\left(X_{1}\cdot A_{\nu}(\alpha^{-1}HSH^{\prime}X_{1})\right)$ is a multiple of the expected value of a noncentral Wishart distributed random matrix $W_{m}(\alpha,I_{m},\Omega)$ , where $\Omega=-\alpha^{-1}HSH^{\prime}$ is the matrix of noncentrality parameters. Hence,

[TABLE]

Substituting this result into (3.64), we obtain

[TABLE]

Substituting (3.63) and (3.65) into (3.62), and simplifying the result, we find that the second term equals

[TABLE]

The third term in the product in (3.2.2) is

[TABLE]

which is the same as the second term but with $S$ and $T$ interchanged.

The fourth term in the product in (3.2.2) is

[TABLE]

Using the explicit formula for $g(T)$ from (3.34) and (3.43), we obtain

[TABLE]

By (2.4) and (2.9), it follows that

[TABLE]

Also, using (2.3), we obtain

[TABLE]

Substituting (3.67) and (3.2.2) into (3.66), we deduce that the fourth term equals

[TABLE]

Combining all four terms, we obtain (3.9).

To establish (3.50), we begin by showing that

[TABLE]

converges in distribution to a random variable with finite variance. By the multivariate Central Limit Theorem, $\sqrt{n}\operatorname{vech}(\alpha I_{m}-\bar{X}_{n})$ converges in distribution to a multivariate normal random vector. Also, by the Law of Large Numbers, $\bar{X}_{n}^{-1}\xrightarrow{p}\alpha^{-1}I_{m}$ . Therefore, by Slutsky’s theorem, $\sqrt{n}\operatorname{vech}(M_{n})$ converges in distribution to a multivariate normal random vector, so it follows from the Continuous Mapping Theorem that $\operatorname{tr}\left[(\sqrt{n}\,M_{n})^{2}\right]$ converges in distribution to a random variable which has finite variance.

By the Taylor expansion (3.2.2),

[TABLE]

Define

[TABLE]

By the Cauchy-Schwarz inequality,

[TABLE]

so we will establish (3.50) by proving that $V_{n}\xrightarrow{p}0$ .

By the triangle inequality and the sub-multiplicative property of the Frobenius norm, we have

[TABLE]

Applying (3.20), we obtain

[TABLE]

Also, since $U_{j}=X_{j}^{1/2}[\alpha^{-1}I_{m}+t(\bar{X}_{n}^{-1}-\alpha^{-1}I_{m})]X_{j}^{1/2}$ , $t\in[0,1]$ , then

[TABLE]

Define

[TABLE]

and

[TABLE]

By the Cauchy-Schwarz inequality, $V_{n}\leq V_{n,1}+V_{n,2}$ . Thus, it suffices to show that $V_{n,1},V_{n,2}\xrightarrow{p}0$ .

We first establish that $V_{n,1}\xrightarrow{p}0$ . By the Cauchy-Schwarz inequality,

[TABLE]

By Weyl’s inequality for the smallest eigenvalue of the sum of two symmetric matrices,

[TABLE]

therefore,

[TABLE]
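As a hedged illustration, Weyl's inequality for the smallest eigenvalue is commonly stated as $\lambda_{\min}(A+B)\geq\lambda_{\min}(A)+\lambda_{\min}(B)$ for symmetric $A$ and $B$ ; we assume that this is the form applied above. The following sketch (assuming NumPy; the random symmetric matrices are illustrative only) records the smallest observed slack in that inequality over repeated trials.

```python
import numpy as np

rng = np.random.default_rng(3)
worst = np.inf
for _ in range(200):
    g, h = rng.standard_normal((2, 4, 4))
    a = g + g.T                      # random symmetric matrix A
    b = h + h.T                      # random symmetric matrix B
    # Slack lambda_min(A+B) - lambda_min(A) - lambda_min(B); Weyl says it is >= 0.
    slack = (np.linalg.eigvalsh(a + b)[0]
             - np.linalg.eigvalsh(a)[0]
             - np.linalg.eigvalsh(b)[0])
    worst = min(worst, slack)
```

The recorded slack stays nonnegative up to rounding error, as the inequality requires.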

By the Law of Large Numbers and the Continuous Mapping Theorem, we have

[TABLE]

Again by the Law of Large Numbers,

[TABLE]

Therefore, to complete the proof of $V_{n,1}\xrightarrow{p}0$ , we need to establish that

[TABLE]

Since $\lVert T\rVert^{3}_{F}=(\operatorname{tr}T^{2})^{3/2}$ then these criteria are the same, so we show that the first one holds. For $T>0$ , we have $\operatorname{tr}T^{2}\leq(\operatorname{tr}T)^{2}$ and hence $(\operatorname{tr}T^{2})^{3/2}\leq(\operatorname{tr}T)^{3}$ . By Lemma 3.12,

[TABLE]

for $\alpha>\tfrac{1}{2}(m+3)$ , so it follows that $V_{n,1}\xrightarrow{p}0$ .
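The trace inequality used in this step follows from the positivity of the eigenvalues of $T$ . Writing $\lambda_{1},\dotsc,\lambda_{m}>0$ for those eigenvalues (notation introduced only for this check):

```latex
\begin{align*}
(\operatorname{tr}T)^{2}
  &= \Big(\sum_{i=1}^{m}\lambda_{i}\Big)^{2}
   = \sum_{i=1}^{m}\lambda_{i}^{2} + 2\sum_{1\le i<j\le m}\lambda_{i}\lambda_{j}
   \;\ge\; \sum_{i=1}^{m}\lambda_{i}^{2}
   = \operatorname{tr}T^{2},\\
\lVert T\rVert_{F}^{3}
  &= (\operatorname{tr}T^{2})^{3/2}
   \le \big((\operatorname{tr}T)^{2}\big)^{3/2}
   = (\operatorname{tr}T)^{3}.
\end{align*}
```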

As for $V_{n,2}\xrightarrow{p}0$ , the proof is similar. By the Cauchy-Schwarz inequality,

[TABLE]

Applying the Law of Large Numbers and the Continuous Mapping Theorem, we obtain $\lVert\bar{X}_{n}^{-1}-\alpha^{-1}I_{m}\rVert_{F}\xrightarrow{p}0$ and

[TABLE]

Thus, to complete the proof of $V_{n,2}\xrightarrow{p}0$ , we need to establish that

[TABLE]

which are identical criteria. Since $\lVert T\rVert^{3}_{F}=(\operatorname{tr}T^{2})^{3/2}$ , it suffices to show that

[TABLE]

However, $\operatorname{tr}T^{2}\leq(\operatorname{tr}T)^{2}$ , and hence $(\operatorname{tr}T^{2})^{3/2}\leq(\operatorname{tr}T)^{3}$ ; so, by Lemma 3.12,

[TABLE]

for all $\alpha>\tfrac{1}{2}(m+1)$ . Therefore, $V_{n,2}\xrightarrow{p}0$ for all $\alpha>\tfrac{1}{2}(2m-1)$ .

Since $0\leq V_{n}\leq V_{n,1}+V_{n,2}$ , we conclude that $V_{n}\xrightarrow{p}0$ for all $\alpha>\max\{\tfrac{1}{2}(2m-1),\tfrac{1}{2}(m+3)\}$ . By Slutsky’s theorem, $[\alpha^{-1}\Gamma_{m}(\alpha)]^{2}\operatorname{tr}\left[(\sqrt{n}\,M_{n})^{2}\right]\cdot V_{n}\xrightarrow{d}0;$ and therefore $[\alpha^{-1}\Gamma_{m}(\alpha)]^{2}\operatorname{tr}\left[(\sqrt{n}\,M_{n})^{2}\right]\cdot V_{n}\xrightarrow{p}0.$ Hence, by (3.69), $\lVert\mathcal{Z}_{n}-\mathcal{Z}_{n,1}\rVert_{L^{2}}\xrightarrow{p}0$ , for $\alpha>\max\{\tfrac{1}{2}(2m-1),\tfrac{1}{2}(m+3)\}$ .

To establish (3.51), define $V_{j}:=g(T,X_{j})-g(T)$ for $T>0$ and $j=1,\dotsc,n$ . Then it is straightforward to verify that

[TABLE]

and therefore

[TABLE]

By the Law of Large Numbers and the Continuous Mapping Theorem, $\operatorname{tr}(M_{n}^{2})\xrightarrow{p}0$. Since $g(T)=E[g(T,X_{j})]$, we have $E(V_{j})=0$, $j=1,\dotsc,n$; also, $V_{1},\dotsc,V_{n}$ are i.i.d.

We now show that $E_{X_{j}}E_{T}\lVert V_{j}\rVert^{2}_{F}<\infty$. First,

[TABLE]

By the triangle inequality,

[TABLE]

Therefore, it suffices to show that $E_{X_{j}}E_{T}\lVert g(T,X_{j})\rVert^{2}_{F}$ and $E_{T}\lVert g(T)\rVert^{2}_{F}$ are finite.

Applying the sub-multiplicative property of the Frobenius norm, and the inequality (3.17), we have

[TABLE]

$c>0$ ; therefore,

[TABLE]

By Lemma 3.12, $E_{T}\left[(\operatorname{tr}T^{2})(\lambda_{\min}(T))^{-1}\right]<\infty$ for $\alpha>\tfrac{1}{2}(m+1)$ . Since $X_{j}\sim W_{m}(\alpha,I_{m})$ , $j=1,\dotsc,n$ , then the same holds for $E_{X_{j}}\left[(\operatorname{tr}X^{2}_{j})\,(\lambda_{\min}(X_{j}))^{-1}\right]$ . Therefore, it follows that $E_{X_{j}}E_{T}\lVert g(T,X_{j})\rVert^{2}_{F}<\infty$ for all $\alpha>\tfrac{1}{2}(2m-1)$ .

To show that $E_{T}\lVert g(T)\rVert^{2}_{F}<\infty$ , $T\sim W_{m}(\alpha,I_{m})$ , we observe that $\lVert g(T)\rVert^{2}_{F}=\operatorname{tr}[(g(T))^{2}]$ is a polynomial in $T$ and therefore its expectation is finite since the moment-generating function of $T$ exists.

Next, we vectorize the matrices $V_{1},\dotsc,V_{n}$ and denote the corresponding vectors by $\operatorname{vech}(V_{1}),\dotsc,\operatorname{vech}(V_{n})$ . Then, $\operatorname{vech}(V_{1}),\dotsc,\operatorname{vech}(V_{n})$ are i.i.d. zero-mean random vectors with finite covariance matrices. By the multivariate Central Limit Theorem, $n^{-1/2}\sum_{j=1}^{n}\operatorname{vech}(V_{j})$ converges in distribution to a multivariate normal random vector. Define

[TABLE]

for $T>0$ ; we regard $\mathcal{V}$ as a random element in $L^{2}$ . Since $\lVert\cdot\rVert_{F}$ is a continuous function, it follows from the Continuous Mapping theorem that $\mathcal{V}$ converges to a random element in $L^{2}$ and also that

[TABLE]

converges in distribution to a random variable that has finite variance. Since $\operatorname{tr}(M_{n}^{2})\xrightarrow{p}0$, it follows from (3.70) and Slutsky’s theorem that $\lVert\mathcal{Z}_{n,1}-\mathcal{Z}_{n,2}\rVert^{2}_{L^{2}}\xrightarrow{d}0$; therefore $\lVert\mathcal{Z}_{n,1}-\mathcal{Z}_{n,2}\rVert_{L^{2}}\xrightarrow{p}0$.

To establish (3.52), we observe that

[TABLE]

Substituting the now-familiar explicit formula for $g(T)$ from (3.43), we obtain

[TABLE]

and as we have seen before, the latter integral is finite.

Now, we observe that

[TABLE]

By the multivariate Central Limit Theorem, $\sqrt{n}\operatorname{vech}(\alpha I_{m}-\bar{X}_{n})$ converges in distribution to a multivariate normal random vector; and by the Law of Large Numbers for random vectors, $\bar{X}_{n}^{-1}\xrightarrow{p}\alpha^{-1}I_{m}$ . By Slutsky’s theorem, $\sqrt{n}(\alpha I_{m}-\bar{X}_{n})(\bar{X}_{n}^{-1}-\alpha^{-1}I_{m})\xrightarrow{d}0$ , and so $\sqrt{n}(\alpha I_{m}-\bar{X}_{n})(\bar{X}_{n}^{-1}-\alpha^{-1}I_{m})\xrightarrow{p}0$ . Hence, by the Continuous Mapping Theorem,

[TABLE]

and so $\lVert\mathcal{Z}_{n,2}-\mathcal{Z}_{n,3}\rVert_{L^{2}}\xrightarrow{p}0$ .

Finally, by the Continuous Mapping Theorem in $L^{2}$ ([16, p. 67], [10, p. 31]), $\lVert\mathcal{Z}_{n}\rVert^{2}_{L^{2}}\xrightarrow{d}\lVert\mathcal{Z}\rVert^{2}_{L^{2}}$ , i.e.,

[TABLE]

The proof now is complete. ∎

3.3 Eigenvalues and eigenfunctions of the covariance operator

The covariance operator $\mathcal{S}:L^{2}\rightarrow L^{2}$ of the random element $\mathcal{Z}$ is defined for $S>0$ and $f\in L^{2}$ by

[TABLE]

where $K(S,T)$ is the covariance function defined in equation (3.9). Let $\{\delta_{k}:k\geq 1\}$ be the positive eigenvalues of $\mathcal{S}$, listed in non-increasing order and repeated according to their multiplicities; also, let $\{\chi^{2}_{1k}:k\geq 1\}$ be i.i.d. $\chi^{2}_{1}$-distributed random variables. It is well known that the integrated squared process, $\int_{T>0}\mathcal{Z}^{2}(T)\,\mathrm{d}P_{0}(T)$, has the same distribution as $\sum_{k=1}^{\infty}\delta_{k}\chi^{2}_{1k}$. This result follows from the Karhunen-Loève expansion of the Gaussian random field $\mathcal{Z}(T)$; see Le Maître and Knio [48, Chapter 2] or Vakhania [68, p. 58]. Therefore, the limiting null distribution of $\boldsymbol{T}^{2}_{n}$ is the same as the distribution of $\sum_{k=1}^{\infty}\delta_{k}\chi^{2}_{1k}$. Let us also denote by $\tilde{\delta}_{k}$, $k\geq 1$, an enumeration, listed in non-increasing order, of the distinct values of the eigenvalues $\delta_{k}$, and by $N(\tilde{\delta}_{k})$ the corresponding multiplicities. Then, $\boldsymbol{T}_{n}^{2}$ converges in distribution to $\sum_{k\geq 1}\tilde{\delta}_{k}\chi^{2}_{N(\tilde{\delta}_{k})}$, where the $\chi^{2}_{N(\tilde{\delta}_{k})}$, $k\geq 1$, are mutually independent chi-square random variables with $N(\tilde{\delta}_{k})$ degrees of freedom, respectively.
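To illustrate how the representation $\sum_{k}\delta_{k}\chi^{2}_{1k}$ can be used numerically, the following Python sketch simulates a truncated weighted sum of independent $\chi^{2}_{1}$ variables and reads off an empirical 95th percentile. The weights below are hypothetical placeholders, not the eigenvalues $\delta_{k}$ of $\mathcal{S}$, and the function names are ours; only the standard library is assumed.

```python
import math
import random


def simulate_weighted_chisq(weights, draws=20000, seed=0):
    """Draw from sum_k weights[k] * Z_k^2 with Z_k i.i.d. standard normal."""
    rng = random.Random(seed)
    return [
        sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        for _ in range(draws)
    ]


def empirical_quantile(sample, p):
    """Order-statistic quantile: smallest value with at least a fraction p at or below it."""
    ordered = sorted(sample)
    index = min(len(ordered) - 1, max(0, math.ceil(p * len(ordered)) - 1))
    return ordered[index]


# Hypothetical, geometrically decaying weights standing in for delta_1, ..., delta_N.
weights = [0.1 * 0.5 ** k for k in range(8)]
sample = simulate_weighted_chisq(weights)
mean_estimate = sum(sample) / len(sample)  # should be close to sum(weights)
q95 = empirical_quantile(sample, 0.95)     # approximate upper 5% critical value
```

Because each $\chi^{2}_{1}$ variable has mean 1, the simulated mean should be close to the sum of the weights, which gives a quick sanity check on any truncation.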

For $S,T>0$ , define

[TABLE]

the first term in the covariance function defined in equation (3.9); by (3.60) and (3.61),

[TABLE]

We will first find the eigenvalues and eigenfunctions of the integral operator $\mathcal{S}_{0}:L^{2}\rightarrow L^{2}$ , defined for $S>0$ and $f$ in $L^{2}$ by

[TABLE]

Recall that $m\geq 2$ and $\alpha>\max\{\tfrac{1}{2}(2m-1),\tfrac{1}{2}(m+3)\}$ . Throughout the remainder of this work, we use the notation

[TABLE]

We also set

[TABLE]

for $\kappa$ ranging over all partitions, and

[TABLE]

Theorem 3.13.

The collection $\{(\rho_{\kappa},\textgoth{L}_{\kappa}^{(\nu)})\}$, where $\kappa$ ranges over the set of all partitions, is a complete enumeration of the eigenvalues and eigenfunctions, respectively, of the operator $\mathcal{S}_{0}$. Further, the eigenfunctions $\{\textgoth{L}_{\kappa}^{(\nu)}\}$, for $\kappa$ ranging over all partitions, form an orthonormal basis in $L^{2}$, and $\mathcal{S}_{0}$ is positive and of trace-class.

Proof.

Recall from [53, p. 290, Problem 7.21] the Poisson kernel: For $r\in(0,1)$ and $X,Y>0$ ,

[TABLE]

In this expansion, set

[TABLE]

so that $r\in(0,1)$ . Note that $r^{1/2}=1+\tfrac{1}{2}\alpha(1-\beta)$ satisfies the quadratic equation

[TABLE]

and also that this equation is equivalent to the identity

[TABLE]

In (3.77), also set

[TABLE]

Then,

[TABLE]

Applying (3.75), (3.76), and (3.78)-(3.80) to (3.77), and substituting the result in (3.71), we obtain, for $S,T>0$, the pointwise convergent series expansion,

[TABLE]

By (2.22), the generalized Laguerre polynomials $\{\mathcal{L}_{\kappa}^{(\nu)}\}$ form an orthonormal system; then it is straightforward to verify that the system $\{\textgoth{L}_{\kappa}^{(\nu)}\}$ also is orthonormal in $L^{2}$ , for $\kappa$ ranging over all partitions, i.e.,

[TABLE]

Now we verify that the series (3.81) converges in the separable tensor product Hilbert space $L^{2}\otimes L^{2}:=L^{2}(P_{0}\times P_{0})$ . By the Cauchy criterion, it suffices to prove that for each $\epsilon>0$ , there exists $N\in\mathbb{N}$ such that

[TABLE]

for all $l_{1},l_{2}\in\mathbb{N}$ such that $l_{2}\geq l_{1}\geq N$ . By squaring the integrand, it suffices by Fubini’s theorem to consider

[TABLE]

Since the system $\{\textgoth{L}_{\kappa}^{(\nu)}\}$ is orthonormal, the latter sum reduces to

[TABLE]

where $p_{m}(k)$ represents the number of partitions of $k$ into at most $m$ parts. It is well-known that

[TABLE]

Therefore, $\sum_{k=0}^{\infty}b_{\alpha}^{8k}p_{m}(k)$ is a convergent series. Since every convergent series in any metric space is Cauchy, it follows that for each $\epsilon>0$ , there exists $N\in\mathbb{N}$ such that $\sum_{k=l_{1}}^{l_{2}}b_{\alpha}^{8k}p_{m}(k)<\epsilon$ , for all $l_{1},l_{2}\in\mathbb{N}$ such that $l_{2}\geq l_{1}\geq N$ . Therefore, the series (3.81) is Cauchy in $L^{2}\otimes L^{2}$ and hence,

[TABLE]

By Fubini’s theorem, the latter expression equals

[TABLE]

It follows from the orthonormality, (3.82), of the system $\{\textgoth{L}_{\kappa}^{(\nu)}\}$ that, for $l\in\mathbb{N}$ and partition $\sigma$ such that $l\geq|\sigma|$ ,

[TABLE]

By (3.73) and (3.84),

[TABLE]

By the Cauchy-Schwarz inequality, this latter expression is bounded by

[TABLE]

By the orthonormality property (3.82) and the fact that $P_{0}$ is a probability distribution, the second term in (3.85) equals 1; therefore,

[TABLE]

Since $l$ is arbitrary, we now let $l\rightarrow\infty$. By (3.83), the right-hand side of (3.86) converges to $0$, so we obtain

[TABLE]

which proves that $\mathcal{S}_{0}\textgoth{L}_{\sigma}^{(\nu)}(S)=\rho_{\sigma}\textgoth{L}_{\sigma}^{(\nu)}(S)$ for $P_{0}$-almost every $S$. Therefore, for every partition $\kappa$, $\rho_{\kappa}$ is an eigenvalue of $\mathcal{S}_{0}$ with corresponding eigenfunction $\textgoth{L}_{\kappa}^{(\nu)}$.
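The convergence of $\sum_{k}b_{\alpha}^{8k}p_{m}(k)$ used above can be checked numerically. The sketch below, with illustrative function names, computes $p_{m}(k)$ by a standard recursion and compares partial sums of $\sum_{k}p_{m}(k)x^{k}$ with the classical generating function $\prod_{j=1}^{m}(1-x^{j})^{-1}$, $|x|<1$. Here $x$ is a stand-in for $b_{\alpha}^{8}$ and is assumed to lie in $(0,1)$.

```python
def partitions_at_most_m_parts(k_max, m):
    """Return [p_m(0), ..., p_m(k_max)]; p_m(k) counts partitions of k into at most m parts."""
    # By conjugation, p_m(k) also counts partitions of k into parts of size at most m,
    # which the usual coin-change recursion enumerates.
    counts = [1] + [0] * k_max
    for part in range(1, m + 1):
        for k in range(part, k_max + 1):
            counts[k] += counts[k - part]
    return counts


def partial_sum(x, m, k_max):
    """Partial sum of sum_k p_m(k) * x**k up to k = k_max."""
    counts = partitions_at_most_m_parts(k_max, m)
    return sum(c * x ** k for k, c in enumerate(counts))


def closed_form(x, m):
    """Generating function prod_{j=1}^m 1 / (1 - x**j), valid for |x| < 1."""
    value = 1.0
    for j in range(1, m + 1):
        value /= 1.0 - x ** j
    return value


x = 0.5  # stand-in for b_alpha**8, assumed to be in (0, 1)
approx = partial_sum(x, m=3, k_max=200)
exact = closed_form(x, m=3)
```

Since $p_{m}(k)$ grows only polynomially in $k$, the geometric factor dominates and the partial sums stabilize quickly.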

Since the kernel $K_{0}(S,T)$ is symmetric in $(S,T)$ , it follows that $\mathcal{S}_{0}$ is symmetric. To show that $\mathcal{S}_{0}$ is positive, we observe that for $f\in L^{2}$ ,

[TABLE]

Substituting for $K_{0}(S,T)$ from (3.72), we obtain

[TABLE]

Applying Fubini’s theorem to reverse the order of the integration, we find that the inner integrals with respect to $S$ and $T$ are complex conjugates of each other; therefore,

[TABLE]

which is positive. Thus, $\mathcal{S}_{0}$ is positive.

Next, we prove that $\mathcal{S}_{0}$ is of trace-class. For $f\in L^{2}$ , $S>0$ , it again follows by (3.72) and Fubini’s theorem that

[TABLE]

Denote by $\mathcal{T}_{0}:L^{2}\rightarrow L^{2}$ the integral operator,

[TABLE]

$T>0$ . By (2.38), $|\Gamma_{m}(\alpha)\ A_{\nu}(T,\alpha^{-1}X)|\,\leq 1$ and therefore

[TABLE]

for $T,X>0$ . By [72, p. 93, Theorem 8.8], it follows that $\mathcal{T}_{0}$ is a Hilbert-Schmidt operator. Now, we can write (3.3) as

[TABLE]

$S>0$ , which proves that $\mathcal{S}_{0}$ is of trace-class.

To complete the proof, we now show that the set $\{\textgoth{L}_{\kappa}^{(\nu)}\}$ is complete. It is sufficient to show that if $f\in L^{2}$ with $\langle f,\textgoth{L}_{\kappa}^{(\nu)}\rangle_{L^{2}}=0$ for all partitions $\kappa$ , then $f=0$ $P_{0}$ -almost everywhere. First, we note that

[TABLE]

by the Cauchy-Schwarz inequality. Since $f\in L^{2}$ , the second term on the right-hand side of (3.3) is finite. Taking the limit on both sides of (3.3) as $l\rightarrow\infty$ and applying (3.83), we obtain

[TABLE]

Since $\langle f,\textgoth{L}_{\kappa}^{(\nu)}\rangle_{L^{2}}=0$ for all partitions $\kappa$ then (3.90) reduces to

[TABLE]

Therefore, by (3.87), we obtain for $P_{0}$ -almost every $X$ ,

[TABLE]

Since the function $\Gamma_{m}(\alpha)A_{\nu}(S,\alpha^{-1}X)$ is continuous for all $X>0$ and fixed $S>0$ and by (2.38),

[TABLE]

for $X,S>0$, then by the Dominated Convergence Theorem, the integral on the left-hand side of (3.91) is a continuous function of $X$. Since $P_{0}$ has a positive density on the cone of positive definite matrices, two continuous functions that are equal $P_{0}$-almost everywhere are equal everywhere; hence (3.91) holds for all $X>0$.

Henceforth, without loss of generality, we assume that $f$ is real-valued. Let $f^{+}$ and $f^{-}$ denote the positive and negative parts of $f$ , respectively. Then, $f=f^{+}-f^{-}$ , $f^{+}$ and $f^{-}$ are nonnegative, and since $f\in L^{2}$ then by the Cauchy-Schwarz inequality, $f^{+}$ and $f^{-}$ are $P_{0}$ -integrable. Also, by (3.91),

[TABLE]

$X>0$ . By the Uniqueness Theorem for orthogonally invariant Hankel transforms, Theorem 2.13, we notice that there are only two possible cases. Either

[TABLE]

or

[TABLE]

For the first case, we have $f^{+}=f^{-}=0$ and so $f=0$ $P_{0}$ -almost everywhere. As for the second case, we have

[TABLE]

$X>0$ . By the Uniqueness Theorem for orthogonally invariant Hankel transforms, we obtain $f^{+}=f^{-}$ and hence $f=0$ $P_{0}$ -almost everywhere. This proves that the orthonormal set $\{\textgoth{L}_{\kappa}^{(\nu)}\}$ is complete, and therefore, it forms a basis in the separable Hilbert space $L^{2}$ . ∎

The proof of the following theorem is similar to the proof of Theorem 3.13, and the complete details are provided by Hadjicosta [32].

Theorem 3.14.

Let $\mathcal{S}:L^{2}\rightarrow L^{2}$ be the covariance operator of the random element $\mathcal{Z}$ defined as

[TABLE]

for all $S>0$ and for all functions $f$ in $L^{2}$ , where $K(S,T)$ is the covariance function defined in equation (3.9). Then, $\mathcal{S}$ is positive and of trace-class.

Recall here that a non-trivial function $\phi\in L^{2}$ is an eigenfunction of $\mathcal{S}$ if there exists an eigenvalue $\delta\in\mathbb{C}$ such that $\mathcal{S}\phi=\delta\phi$. As $\mathcal{S}$ is self-adjoint and positive, its eigenvalues are real and nonnegative. In the next result, we find the positive eigenvalues (that are not eigenvalues of $\mathcal{S}_{0}$) and corresponding eigenfunctions of the operator $\mathcal{S}$, and we will show in Subsection 3.4 that $0$ is not an eigenvalue of $\mathcal{S}$.

Theorem 3.15.

Let $\delta\in\mathbb{R}$ with $\delta\neq\rho_{\kappa}$ for any partition $\kappa$ . Also, denote by $\tilde{\rho}_{k}$ , $k\geq 1$ , an enumeration, listed in non-increasing order, of the distinct values of the eigenvalues $\rho_{\kappa}$ and define the functions

[TABLE]

Then, the positive eigenvalues of $\mathcal{S}$ are the positive roots of $G(\delta)=\alpha^{3}A(\delta)B(\delta)-D^{2}(\delta)$ . The eigenfunction corresponding to an eigenvalue $\delta$ has Fourier-Laguerre expansion

[TABLE]

where $C_{1}C_{2}\neq 0$ , $\alpha^{3}C_{1}A(\delta)=C_{2}D(\delta)$ and $C_{2}B(\delta)=C_{1}D(\delta)$ .

Proof.

Since the set $\{\textgoth{L}_{\kappa}^{(\nu)}\}$ , for $\kappa$ ranging over all partitions, is an orthonormal basis for $L^{2}$ , the eigenfunction $\phi\in L^{2}$ corresponding to an eigenvalue $\delta$ can be written as

[TABLE]

We restrict ourselves temporarily to eigenfunctions for which this series is pointwise convergent. Substituting this series into the equation $\mathcal{S}\phi=\delta\phi$ , we obtain

[TABLE]

Substituting the covariance function $K(S,T)$ in the left-hand side of (3.92), writing $K$ in terms of $K_{0}$ , and assuming that we can interchange the order of integration and summation, we obtain

[TABLE]

By Theorem 3.13,

[TABLE]

On writing $\textgoth{L}_{\kappa}^{(\nu)}$ in terms of $L_{\kappa}^{(\nu)}$ , the generalized Laguerre polynomial, applying (2.23) for the Laplace transform of $L_{\kappa}^{(\nu)}$ , and making use of (3.78) and (3.79), we obtain

[TABLE]

Again writing $\textgoth{L}_{\kappa}^{(\nu)}$ in terms of $L_{\kappa}^{(\nu)}$ , applying (2.2), and making use of (3.78) and (3.79), we obtain

[TABLE]

In summary, (3.3) reduces to

[TABLE]

By applying (3.94), we obtain the Fourier-Laguerre expansion of $\operatorname{etr}(-\alpha^{-1}S)$ with respect to the orthonormal basis $\{\textgoth{L}_{\kappa}^{(\nu)}\}$ ; indeed,

[TABLE]

Similarly, by applying (3.95), we have

[TABLE]

Let

[TABLE]

and

[TABLE]

Combining (3.3)-(3.3), we find that (3.92) reduces to

[TABLE]

and by comparing the coefficients of $\textgoth{L}_{\kappa}^{(\nu)}(S)$ , we obtain

[TABLE]

for all partitions $\kappa$ . Since we have assumed that $\delta\neq\rho_{\kappa}$ for any $\kappa$ then we can solve the equation for $\langle\phi,\textgoth{L}_{\kappa}^{(\nu)}\rangle_{L^{2}}$ to obtain

[TABLE]

Substituting (3.98) into (3.3), and applying Lemma 2.6, we get

[TABLE]

therefore,

[TABLE]

Similarly, by substituting (3.98) into (3.3) and applying Lemma 2.6, we get

[TABLE]

hence

[TABLE]

Suppose $C_{1}=C_{2}=0$ ; then it follows from (3.98) that $\langle\phi,\textgoth{L}_{\kappa}^{(\nu)}\rangle_{L^{2}}=0$ for all partitions $\kappa$ , which implies that $\phi=0$ , which is a contradiction since $\phi$ is a non-trivial eigenfunction. Hence, $C_{1}$ and $C_{2}$ cannot be both equal to 0.

Combining (3.99) and (3.100), and using the fact that $C_{1}$ and $C_{2}$ are not both $0$, it is straightforward to establish that $\alpha^{3}A(\delta)B(\delta)=D^{2}(\delta)$: If $C_{1}\neq 0$ and $C_{2}\neq 0$, then we obtain $\alpha^{3}C_{1}C_{2}A(\delta)B(\delta)=C_{1}C_{2}D^{2}(\delta)$ so $\alpha^{3}A(\delta)B(\delta)=D^{2}(\delta)$. If $C_{1}=0$ and $C_{2}\neq 0$, then we obtain $D(\delta)=B(\delta)=0$ and again $\alpha^{3}A(\delta)B(\delta)=D^{2}(\delta)$ is true. If $C_{1}\neq 0$ and $C_{2}=0$, then we obtain $D(\delta)=A(\delta)=0$ and again $\alpha^{3}A(\delta)B(\delta)=D^{2}(\delta)$ is true. Therefore, if $\delta$ is a positive eigenvalue of $\mathcal{S}$ then it is a positive root of the function $G(\delta)=\alpha^{3}A(\delta)B(\delta)-D^{2}(\delta)$.

Conversely, suppose that $\delta$ is a positive root of $G(\delta)$ with $\delta\neq\rho_{\kappa}$ for any partition $\kappa$ . Define

[TABLE]

where $C_{1}$ and $C_{2}$ are real constants that are not both equal to 0 and which satisfy (3.99) and (3.100). That such constants exist can be shown by following a case-by-case argument similar to [65, p. 48]: If $D\neq 0$, $A\neq 0$, and $B\neq 0$, then we can choose $C_{2}$ to be any non-zero number and then set $C_{1}=C_{2}B/D$. If $D=0$, $A=0$, and $B\neq 0$, then we can choose $C_{1}$ to be any non-zero number and then set $C_{2}=0$. If $D=0$, $A\neq 0$, and $B=0$, then we can choose $C_{2}$ to be any non-zero number and then set $C_{1}=0$. Last, if $D=0$, $A=0$, and $B=0$, then we can choose $C_{1}$ and $C_{2}$ to be any non-zero numbers.
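The case analysis above can be summarized in a few lines of Python. This is only a sketch: the function name `choose_constants` is ours, the arguments `A`, `B`, `D` stand for the values $A(\delta)$, $B(\delta)$, $D(\delta)$ at a root of $G$, and the inputs in the example are made-up numbers chosen so that $\alpha^{3}AB=D^{2}$.

```python
def choose_constants(alpha, A, B, D):
    """Return (C1, C2), not both zero, with alpha**3*C1*A == C2*D and C2*B == C1*D.

    Assumes alpha**3 * A * B == D**2, i.e. delta is a root of G.
    """
    if D != 0:
        # Here A and B are both non-zero, because alpha**3 * A * B = D**2 != 0.
        C2 = 1.0
        C1 = C2 * B / D
    elif A == 0 and B != 0:
        C1, C2 = 1.0, 0.0
    elif A != 0 and B == 0:
        C1, C2 = 0.0, 1.0
    else:
        # A == B == D == 0: every pair satisfies both equations.
        C1, C2 = 1.0, 1.0
    return C1, C2


def residuals(alpha, A, B, D, C1, C2):
    """Left-hand side minus right-hand side of the two linear constraints."""
    return alpha ** 3 * C1 * A - C2 * D, C2 * B - C1 * D


# Hypothetical values with alpha**3 * A * B = 8 * 1 * 2 = 16 = D**2.
C1, C2 = choose_constants(2.0, 1.0, 2.0, 4.0)
```

When $D=0$, the relation $\alpha^{3}AB=D^{2}$ forces $AB=0$, so the four branches are exhaustive.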

Now define, for $S>0$ , the function

[TABLE]

By applying the ratio test, we obtain $\sum_{k=0}^{\infty}\sum_{|\kappa|=k}\gamma_{\kappa}^{2}<\infty$ ; therefore $\tilde{\phi}\in L^{2}$ .

We also verify that the series (3.102) converges pointwise. By (2.21) and (3.76),

[TABLE]

$S>0$ . By inequality (2.25),

[TABLE]

$S>0$ . Therefore,

[TABLE]

Thus, to establish the pointwise convergence of the series (3.102), we need to show that

[TABLE]

The convergence of the above series follows from the ratio test.

Next, we justify the interchange of summation and integration in our calculations. By a corollary to Theorem 16.7 in Billingsley [11, p. 224], we need to verify that

[TABLE]

First, we find a bound for $K_{0}(S,T)$ . By (2.38), $|\Gamma_{m}(\alpha)A_{\nu}(-\alpha^{-2}S,T)|\ \leq 1$ for $S,T>0$ . Thus, by (3.71),

[TABLE]

By the triangle inequality and by (3.106), we have

[TABLE]

Thus, to prove (3.105), we need to establish that

[TABLE]

By applying the bound (3.103), we see that it suffices to prove that

[TABLE]

and

[TABLE]

As these integrals are finite, the convergence of both series follows from (3.104).

To calculate $\mathcal{S}\tilde{\phi}(S)$ from (3.102), we follow the same steps as before to obtain

[TABLE]

By the definition (3.101) of $\gamma_{\kappa}$ , and noting that

[TABLE]

we have

[TABLE]

Therefore, $\delta$ is an eigenvalue of $\mathcal{S}$ with corresponding eigenfunction $\tilde{\phi}$ . ∎

Remark 3.16.

In [33], where we studied goodness-of-fit testing for the gamma distributions, we conjectured that the eigenvalues of $\mathcal{S}$ are not eigenvalues of $\mathcal{S}_{0}$. However, as shown in the next subsection, this is not valid in the case of the Wishart distributions.

3.4 An interlacing property of the eigenvalues

A difficulty of the eigenvalues $\delta_{k}$ is that they have no closed form expression; hence there is no simple formula for $N$ , the number of terms in the truncated series $\sum_{k=1}^{N}\delta_{k}\chi^{2}_{1k}$ that should be used in practice to approximate the asymptotic distribution, $\sum_{k=1}^{\infty}\delta_{k}\chi^{2}_{1k}$ , of the test statistic $\boldsymbol{T}_{n}^{2}$ .

Since $\mathcal{S}_{0}$ is of trace-class, it follows from [13, p. 237, Corollary 3.2] that $Tr(\mathcal{S}_{0})$ can be calculated by integrating the kernel $K_{0}$ or by evaluating the sum of all eigenvalues $\rho_{\kappa}$:

[TABLE]

Since $\mathcal{S}$ also is of trace-class then

[TABLE]

All of these integrals can be evaluated using (2.9) and (2.10), and the resulting sum can be simplified using Lemma 2.6. Consequently, we obtain

[TABLE]

To determine the number of terms in the truncated series $\sum_{k=1}^{N}\delta_{k}\chi^{2}_{1k}$ that should be used in practice to approximate the asymptotic distribution of $\boldsymbol{T}_{n}^{2}$, we derive bounds for the eigenvalues $\delta_{k}$ in terms of the $\rho_{\kappa}$ and then obtain a general formula for $N$ as a function of $\alpha$. We refer to the ratio $(\sum_{k=1}^{N}\delta_{k})/Tr(\mathcal{S})$ as the $N$th scree ratio for $\boldsymbol{T}_{n}^{2}$.

Since the operator $\mathcal{S}$ is compact and positive, the set of all its eigenvalues is countable and contains only nonnegative values [72, Theorem 8.12, p. 98]. The next result shows that the eigenvalues are in fact positive.

Proposition 3.17.

The operators $\mathcal{S}$ and $\mathcal{S}_{0}$ are injective; that is, $\mathcal{S}f=\mathcal{S}g$ if and only if $f=g$, and the same holds for $\mathcal{S}_{0}$. In particular, $0$ is not an eigenvalue of $\mathcal{S}$ or $\mathcal{S}_{0}$.

Proof. By linearity, it suffices to assume that $g=0$ . So, suppose that $\mathcal{S}f=0$ , that is,

[TABLE]

for all $S>0$ . Then for $U>I_{m}$ , by Fubini’s theorem,

[TABLE]

By the definition of the covariance function $K$ in (3.9),

[TABLE]

By (2.37), (2.14), and Fubini’s theorem, we have

[TABLE]

Also, by (2.4) and (2.9), we have

[TABLE]

and, by (2.10),

[TABLE]

Substituting these results into (3.4) and discarding extraneous factors, we obtain

[TABLE]

Replacing $U$ by $U^{-1}$ , we find that (3.111) is equivalent to

[TABLE]

Differentiating both sides of (3.112) with respect to $U$ , we obtain

[TABLE]

Since $T\overset{d}{=}HTH^{\prime}$ for all $H\in O(m)$, and $f(HTH^{\prime})=f(T)$, we have

[TABLE]

Therefore,

[TABLE]

Differentiating both sides of (3.113) with respect to $U$ , we find that

[TABLE]

As this latter integral is a Laplace transform, it follows from the uniqueness theorem for Laplace transforms that $f=0$, $P_{0}$-almost everywhere. The same argument may be used in the case of $\mathcal{S}_{0}$.

Consequently, $0$ is not an eigenvalue of $\mathcal{S}$. ∎

We now derive an interlacing property of the eigenvalues $\delta_{k}$ and $\rho_{\kappa}$ . To state this property, denote by $\xi_{k}$ , $k=1,2,3\ldots$ the partitions of all nonnegative integers, listed in increasing lexicographic order, e.g., $\xi_{1}=(0)$ , $\xi_{2}=(1)$ , $\xi_{3}=(2)$ , $\xi_{4}=(1^{2})$ , $\xi_{5}=(3)$ , $\xi_{6}=(21)$ , $\xi_{7}=(1^{3}),\ldots$
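The enumeration $\xi_{1},\xi_{2},\dotsc$ can be generated mechanically. The following Python sketch, whose function names are ours, lists partitions by weight and, within each weight, in the order displayed above; the cap of at most $m$ parts is our assumption, reflecting that partitions with more than $m$ parts do not contribute for $m\times m$ matrix arguments. The partition $(0)$ is represented by the tuple `(0,)`.

```python
from itertools import islice


def partitions_of(k, max_parts, largest=None):
    """Yield partitions of k with at most max_parts parts, larger first parts first."""
    if largest is None:
        largest = k
    if k == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions_of(k - first, max_parts - 1, first):
            yield (first,) + rest


def enumerate_partitions(m):
    """Yield xi_1, xi_2, ...: all partitions with at most m parts, ordered by weight."""
    k = 0
    while True:
        for kappa in partitions_of(k, m):
            yield kappa if kappa else (0,)
        k += 1


first_seven = list(islice(enumerate_partitions(3), 7))
```

For $m=3$ the first seven tuples reproduce $\xi_{1},\dotsc,\xi_{7}$ as listed; for $m=2$ the partition $(1^{3})$ is skipped because it has three parts.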

Proposition 3.18.

For all $k\geq 1$, $\rho_{\xi_{k}}\geq\delta_{k}\geq\rho_{\xi_{k+2}}$. Further, for $k\geq 3$, every eigenvalue of $\mathcal{S}_{0}$ of multiplicity $p_{m}(k)$ is an eigenvalue of $\mathcal{S}$ with multiplicity $p_{m}(k)-2$, $p_{m}(k)-1$, or $p_{m}(k)$.

Proof. Define the kernels $k_{0}(S,T)=-\operatorname{etr}(-(S+T)/\alpha)$ and

[TABLE]

where $S,T>0$ . Also, define on $L^{2}$ the corresponding integral operators,

[TABLE]

$j=0,1$ , $S>0$ . Then it follows from (3.9) that $\mathcal{S}=\mathcal{S}_{0}+\mathcal{U}_{0}+\mathcal{U}_{1}$ .

It is clear that each $\mathcal{U}_{j}$ is self-adjoint and of rank one, i.e., the range of $\mathcal{U}_{j}$ is a one-dimensional subspace of $L^{2}$ . Also, $\mathcal{S}_{0}+\mathcal{U}_{0}$ is self-adjoint, and by following the same steps as in Theorem 3.14, we see that it is positive and compact.

By the same argument as in the proof of Proposition 3.17, we find that the operator $\mathcal{S}_{0}+\mathcal{U}_{0}$ is injective; hence, the eigenvalues of $\mathcal{S}_{0}+\mathcal{U}_{0}$ are positive.

Denote by $\omega_{k}$ , $k\geq 1$ , the eigenvalues of $\mathcal{S}_{0}+\mathcal{U}_{0}$ , where $\omega_{1}\geq\omega_{2}\geq\cdots$ , repeated according to their multiplicities. Since $\mathcal{S}_{0}$ is compact, self-adjoint, and injective, and since $\mathcal{U}_{0}$ is self-adjoint and of rank one, it follows from Hochstadt [36] or Dancis and Davis [20] that the eigenvalues of $\mathcal{S}_{0}$ interlace the eigenvalues of $\mathcal{S}_{0}+\mathcal{U}_{0}$ , i.e., $\rho_{\xi_{1}}\geq\omega_{1}\geq\rho_{\xi_{2}}\geq\omega_{2}\geq\rho_{\xi_{3}}\geq\omega_{3}\geq\rho_{\xi_{4}}\geq\dotsc$ . Further, by Hochstadt [36], every eigenvalue of multiplicity $p_{m}(k)$ , $k\geq 2$ , of $\mathcal{S}_{0}$ , where $p_{m}(k)$ denotes the number of partitions of $k$ in at most $m$ parts, is an eigenvalue of $\mathcal{S}_{0}+\mathcal{U}_{0}$ with multiplicity $p_{m}(k)$ or $p_{m}(k)-1$ .

Since $\mathcal{U}_{1}$ is self-adjoint and of rank one, a second application of the theorem of Hochstadt, or of Dancis and Davis, shows that the eigenvalues of $\mathcal{S}_{0}+\mathcal{U}_{0}$ interlace the eigenvalues of $\mathcal{S}_{0}+\mathcal{U}_{0}+\mathcal{U}_{1}\equiv\mathcal{S}$, i.e., $\omega_{k}\geq\delta_{k}\geq\omega_{k+1}$ for all $k\geq 1$.

Combining the conclusions of the preceding paragraphs, we have $\rho_{\xi_{k}}\geq\omega_{k}\geq\delta_{k}\geq\omega_{k+1}\geq\rho_{\xi_{k+2}}$, and therefore $\rho_{\xi_{k}}\geq\delta_{k}\geq\rho_{\xi_{k+2}}$, $k\geq 1$. Further, by Hochstadt [36], for $k\geq 3$, every eigenvalue of $\mathcal{S}_{0}$ of multiplicity $p_{m}(k)$ is an eigenvalue of $\mathcal{S}$ with multiplicity $p_{m}(k)-2$, $p_{m}(k)-1$, or $p_{m}(k)$. ∎
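A finite-dimensional analogue of this interlacing argument can be checked numerically. In the Python sketch below, a diagonal matrix plays the role of $\mathcal{S}_{0}$ and two negative semidefinite rank-one updates play the roles of $\mathcal{U}_{0}$ and $\mathcal{U}_{1}$; the sign is our assumption, chosen to match the direction of the inequalities in the proof. All numerical values are hypothetical, and the eigenvalues are computed with a plain cyclic Jacobi iteration so that only the standard library is needed.

```python
import math


def jacobi_eigenvalues(matrix, sweeps=100, tol=1e-22):
    """Eigenvalues (largest first) of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle chosen to annihilate the (p, q) entry.
                if a[q][q] == a[p][p]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q (A times J)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q (J transpose times A J)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def subtract_rank_one(a, u):
    """Return a - u u', a negative semidefinite rank-one perturbation of a."""
    n = len(a)
    return [[a[i][j] - u[i] * u[j] for j in range(n)] for i in range(n)]


rho = [5.0, 4.0, 3.0, 2.0, 1.0, 0.5]  # hypothetical eigenvalues of the unperturbed matrix
s0 = [[rho[i] if i == j else 0.0 for j in range(6)] for i in range(6)]
u = [0.3, 0.5, 0.2, 0.4, 0.1, 0.2]    # hypothetical rank-one directions
v = [0.2, 0.1, 0.4, 0.3, 0.3, 0.1]
omega = jacobi_eigenvalues(subtract_rank_one(s0, u))
delta = jacobi_eigenvalues(subtract_rank_one(subtract_rank_one(s0, u), v))
```

In this toy setting the sorted lists satisfy `rho[k] >= omega[k] >= rho[k+1]` and `omega[k] >= delta[k] >= omega[k+1]`, and hence `rho[k] >= delta[k] >= rho[k+2]`, mirroring the chain of inequalities in the proof.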

For $\epsilon\in(0,1)$, we can now determine a value for $N$ such that the $N$th scree ratio of $\boldsymbol{T}_{n}^{2}$ exceeds $1-\epsilon$. Applying the interlacing inequalities for $\delta_{k}$, we obtain $\sum_{k=1}^{N}\delta_{k}\geq\sum_{2\leq|\kappa|\leq r}\rho_{\kappa}$, where $N=\sum_{k=2}^{r}p_{m}(k)$. Since $Tr(\mathcal{S}_{0})>Tr(\mathcal{S})$, we advise that $N$ be chosen so that

[TABLE]

This criterion leads to a value for $N$ that is readily applicable in the analysis of data. Substituting $\rho_{\kappa}=\alpha^{m\alpha}b_{\alpha}^{4|\kappa|+2m\alpha}$ and the value of $Tr(\mathcal{S}_{0})$ from (3.4), we obtain

[TABLE]

For $m=2,3$ and $\epsilon=10^{-10}$ , which represents accuracy to ten decimal places, we present in Tables 1 and 2 the values of the lower bounds on $r$ and $N$ for various values of $\alpha$ .
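The relation $N=\sum_{k=2}^{r}p_{m}(k)$ between the two rows of each table is easy to reproduce. The Python sketch below (function names are ours) computes $p_{m}(k)$ by a standard recursion and evaluates $N$ for the values of $r$ listed in Table 1; it does not implement the criterion that determines $r$ from $\alpha$ and $\epsilon$.

```python
def p_m(k, m):
    """Number of partitions of k into at most m parts."""
    # Equivalent, by conjugation, to counting partitions into parts of size at most m.
    counts = [1] + [0] * k
    for part in range(1, m + 1):
        for j in range(part, k + 1):
            counts[j] += counts[j - part]
    return counts[k]


def truncation_size(r, m):
    """N = sum_{k=2}^{r} p_m(k), the number of terms kept for a given r."""
    return sum(p_m(k, m) for k in range(2, r + 1))


# Values of r taken from Table 1 (m = 2).
table1_r = [8, 7, 6, 4, 3, 3, 2]
table1_N = [truncation_size(r, 2) for r in table1_r]
```

For $m=2$ this returns $23, 18, 14, 7, 4, 4, 2$, in agreement with the $N$ row of Table 1; for $m=3$ and $r=6$ it gives $N=21$.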

As indicated by Tables 1 and 2, fewer eigenvalues appear to be needed to approximate the asymptotic distribution of $\boldsymbol{T}_{n}^{2}$ as $\alpha$ increases. As we show in the following result, which is partly a consequence of the interlacing property of the eigenvalues, all but one of the $\delta_{k}$ and $\rho_{\kappa}$ converge to $0$ as $\alpha\to\infty$, a result that is consistent with the decreasing values of $r$ and $N$ in the tables.

Corollary 3.19.

As $\alpha\to\infty$ , $\rho_{\kappa}\to 0$ for all $\kappa\neq(0)$ , $\delta_{k}\to 0$ for all $k\geq 2$ , and $\delta_{1}\to e^{-m}(1-e^{-m})$ .

Proof. By (3.74), $\beta=(1+4\alpha^{-1})^{1/2}$ . Expanding this expression as a power series in $\alpha^{-1}$ , we obtain

[TABLE]

Therefore, $(\alpha b_{\alpha}^{2})^{\alpha}\to e^{-1}$ and $b_{\alpha}\to 0$ as $\alpha\to\infty$. By (3.75), $\rho_{\kappa}=(\alpha b_{\alpha}^{2})^{m\alpha}b_{\alpha}^{4|\kappa|}$, so it follows that if $\kappa\neq(0)$ then $\rho_{\kappa}\to 0$.

By Proposition 3.18, $\delta_{2}\leq\rho_{(1)}$ , so it follows that $\delta_{2}\to 0$ as $\alpha\to\infty$ . Since the $\delta_{k}$ are nonnegative and listed in non-increasing order then it follows that, as $\alpha\to\infty$ , $\delta_{k}\to 0$ for all $k\geq 2$ .

Finally, the limiting value of $\delta_{1}$ is obtained by taking limits in (3.109). ∎

3.5 An application to financial data

In applying our test to a financial data set, we follow in part an example given by Haff et al. [34, Example 5.3]. Let us denote by $S_{j,k}$, for $k=1,2,3$, the daily closing stock prices of Johnson & Johnson (JNJ), Berkshire Hathaway Inc., Class B (BRK-B), and JPMorgan Chase & Co. (JPM), respectively, from November 26, 2017 to November 23, 2018. If a day was a trading holiday, we repeated the observation of the previous day; thus we had 260 observations in total. Then, we computed the daily logarithmic returns $\log(S_{j+1,k}/S_{j,k})$, for $j=1,\dotsc,260$ and $k=1,2,3$; graphs of these logarithmic returns are given in Figure 1. Finally, we partitioned the daily logarithmic returns into biweekly periods and calculated the $3\times 3$ covariance matrix for each biweekly period, resulting in the matrices $X_{1},\dotsc,X_{26}$.
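The data preparation described above can be sketched in Python as follows. The prices here are synthetic (simulated), not the JNJ, BRK-B, and JPM series; the block length of 10 trading days per biweekly period and the divisor $n-1$ in the sample covariance are our assumptions, and the function names are illustrative.

```python
import math
import random


def log_returns(prices):
    """Daily log returns; prices is a list of rows, one per day, one column per stock."""
    return [
        [math.log(today[k] / prev[k]) for k in range(len(prev))]
        for prev, today in zip(prices, prices[1:])
    ]


def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of equal-length observation vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[k] for row in rows) / n for k in range(d)]
    return [
        [
            sum((row[a] - means[a]) * (row[b] - means[b]) for row in rows) / (n - 1)
            for b in range(d)
        ]
        for a in range(d)
    ]


def block_covariances(returns, block_length=10):
    """Split the returns into consecutive blocks and compute one covariance per block."""
    starts = range(0, len(returns) - block_length + 1, block_length)
    return [covariance_matrix(returns[i:i + block_length]) for i in starts]


# Synthetic prices for three hypothetical stocks: 261 days, hence 260 returns.
rng = random.Random(1)
prices = [[100.0, 200.0, 110.0]]
for _ in range(260):
    prices.append([p * math.exp(rng.gauss(0.0, 0.01)) for p in prices[-1]])

returns = log_returns(prices)
matrices = block_covariances(returns)  # 26 matrices, each 3 x 3
```

With 10 return vectors per block and the mean estimated within each block, each matrix is built from 9 residual degrees of freedom under the normal model.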

A common assumption in the literature on stochastic volatility models is that the three-dimensional vectors of daily logarithmic returns,

[TABLE]

$j=1,\ldots,260$, are mutually independent and identically distributed from a trivariate normal distribution. If this assumption were valid then the corresponding biweekly covariance matrices would be independent and identically distributed with Wishart distributions. Thus, we will test the hypothesis that the biweekly covariance matrices are Wishart-distributed with $9$ degrees of freedom, i.e., $\alpha=4.5$.

To apply the test statistic $\boldsymbol{T}^{2}_{n}$ to test the hypothesis that the data are drawn from a Wishart distribution with $9$ degrees of freedom and unspecified scale matrix $\Sigma$ , we use an algorithm developed by Koev and Edelman [45] in Matlab [66] to evaluate the Bessel functions of two matrix arguments. Applying that algorithm to the data on the stock prices, we find that the observed value of the test statistic $\boldsymbol{T}^{2}_{n}$ is $0.127$ .

We conducted a simulation study to approximate $\boldsymbol{T}^{2}_{n\,;\,0.05}$, the 95th percentile of the null distribution of $\boldsymbol{T}_{n}^{2}$. We generated $10{,}000$ random samples of size $n=26$ from the Wishart distribution with $\alpha=4.5$ and scale matrix $\Sigma=I_{3}$, calculated the value of $\boldsymbol{T}^{2}_{n}$ for each sample, and recorded the 95th percentile of all $10{,}000$ simulated values of $\boldsymbol{T}_{n}^{2}$. We repeated this process a total of ten times, finally approximating $\boldsymbol{T}^{2}_{n\,;\,0.05}$ as the mean of the ten simulated 95th percentiles, viz., $\boldsymbol{T}^{2}_{n\,;\,0.05}\approx 0.002$. Since the observed value of $\boldsymbol{T}_{n}^{2}$, $0.127$, exceeds the critical value, we reject at the 5% level of significance the null hypothesis that the random matrices $X_{1},\dotsc,X_{26}$ are Wishart-distributed. Moreover, we derived from our simulation study an approximate P-value of $0.000$ for the test. Therefore, we have strong evidence that the three-dimensional vectors of logarithmic returns, $(\log(S_{j+1,1}/S_{j,1}),\log(S_{j+1,2}/S_{j,2}),\log(S_{j+1,3}/S_{j,3}))$, $j=1,\ldots,260$, do not have a trivariate normal distribution or are not mutually independent.
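The structure of this simulation can be sketched in Python. Evaluating $\boldsymbol{T}_{n}^{2}$ itself requires Bessel functions of two matrix arguments, which we do not attempt here; instead, `toy_statistic` is a clearly labeled stand-in, and any function of the sample could be passed in its place. The sampler assumes that $2\alpha$ is an integer and uses the parametrization in which $W_{m}(\alpha,I_{m})$ has mean $\alpha I_{m}$; the function names and the reduced replication counts are ours.

```python
import math
import random


def sample_wishart(alpha, m, rng):
    """One draw from W_m(alpha, I_m), parametrized so that the mean is alpha * I_m.

    Assumes 2 * alpha is an integer: X = (1/2) * sum of 2*alpha outer products z z'.
    """
    df = int(round(2 * alpha))
    x = [[0.0] * m for _ in range(m)]
    for _ in range(df):
        z = [rng.gauss(0.0, 1.0) for _ in range(m)]
        for a in range(m):
            for b in range(m):
                x[a][b] += 0.5 * z[a] * z[b]
    return x


def toy_statistic(sample, alpha):
    """Stand-in only, NOT T_n^2: squared relative error of the average trace."""
    m = len(sample[0])
    mean_trace = sum(sum(x[i][i] for i in range(m)) for x in sample) / len(sample)
    return (mean_trace / (m * alpha) - 1.0) ** 2


def upper_percentile(values, p=0.95):
    """Order-statistic estimate of the p-th quantile."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, math.ceil(p * len(ordered)) - 1)]


def mc_critical_value(statistic, alpha, m, n, reps, batches, seed=0):
    """Average, over several batches, of the simulated 95th percentile of the statistic."""
    rng = random.Random(seed)
    percentiles = []
    for _ in range(batches):
        values = [
            statistic([sample_wishart(alpha, m, rng) for _ in range(n)], alpha)
            for _ in range(reps)
        ]
        percentiles.append(upper_percentile(values))
    return sum(percentiles) / batches


# Reduced sizes for illustration; the study in the text used reps=10000 and batches=10.
critical = mc_critical_value(toy_statistic, alpha=4.5, m=3, n=26, reps=200, batches=3)
```

Averaging the percentile over several batches, as in the text, reduces the Monte Carlo variability of the estimated critical value.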

For an alternative approach to approximating $\boldsymbol{T}^{2}_{n\,;\,0.05}$, one can use the limiting null distribution of $\boldsymbol{T}^{2}_{n}$. For $\alpha=4.5$, from (3.4), we obtain the approximation $\boldsymbol{T}_{n}^{2}\approx\sum_{k=1}^{21}\delta_{k}\chi^{2}_{1k}$. This requires that we first calculate numerically the $\delta_{k}$ (that are not equal to $\rho_{\kappa}$) and their multiplicities, using the results of Theorem 3.15, and then apply the results of Kotz et al. [46] to derive the distribution of $\sum_{k=1}^{21}\delta_{k}\chi^{2}_{1k}$ and carry out the test. We recommend in practice the one-term approximation [46, Eqs. (71), (79)],

[TABLE]

which leads to the explicit expression, $\boldsymbol{T}^{2}_{n\,;\,0.05}\simeq\frac{1}{2}(\delta_{1}+\delta_{M})\chi^{2}_{M\,;\,0.05}$ , for an approximate critical value of $\boldsymbol{T}_{n}^{2}$ .

As an alternative to calculating $\delta_{1},\dotsc,\delta_{M}$, we can apply the interlacing inequalities in Proposition 3.18 to obtain a stochastic upper bound, $\sum_{k=1}^{M}\delta_{k}\chi^{2}_{1k}\leq\sum_{0\leq|\kappa|\leq r}\rho_{\kappa}\chi^{2}_{1\kappa}$. If we carry out the test by using the upper bound, $\sum_{0\leq|\kappa|\leq r}\rho_{\kappa}\chi^{2}_{1\kappa}$, with its exact distribution or a one-term approximation obtained from Kotz et al. [46, loc. cit.], we will obtain a conservative test of the null hypothesis, i.e., a test with a level of significance at most 5%.

3.6 Consistency of the test

Before stating the theorem, we provide a lemma which will be helpful for establishing consistency of the test. The proof of the following result is similar to the proof of Lemma 3.9.

Lemma 3.20.

For $T>0$ , $Y_{1}>0$ , and $Y_{2}>0$ ,

[TABLE]

Theorem 3.21.

Let $X_{1},X_{2},\dotsc$ be a sequence of $m\times m$ positive-definite, i.i.d. random matrices with mean $\mu$ . Assume also that the p.d.f. of $X_{1}$ is of the form:

[TABLE]

where $f_{0}$ is orthogonally invariant. Let $\gamma\in(0,1)$ denote the level of significance of the test and $c_{n,\gamma}$ be the $(1-\gamma)$ -quantile of the test statistic $\boldsymbol{T}^{2}_{n}$ under $H_{0}$ . If $X_{1},X_{2},\dotsc$ are not Wishart-distributed then

$$\lim_{n\to\infty}P\big(\boldsymbol{T}^{2}_{n}>c_{n,\gamma}\big)=1.$$

Proof. By the definition (3.4) of the test statistic and (3.8), we have

[TABLE]

where $Z_{j}=X_{j}^{1/2}\bar{X}_{n}^{-1}X_{j}^{1/2}$ . By subtracting and adding the quantity

[TABLE]

inside the squared term, and then expanding the integrand, we obtain

[TABLE]

We begin by proving that the integral (3.118) converges almost surely to $0$. By (3.115), there exists a constant $C>0$ such that

[TABLE]

since the Frobenius norm is sub-multiplicative. By the triangle inequality, we conclude that the integral (3.118) is bounded above by

[TABLE]

By the Cauchy-Schwarz inequality,

[TABLE]

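Since $T>0$, the eigenvalues $\lambda_{1},\dotsc,\lambda_{m}$ of $T$ are positive, so $\operatorname{tr}T^{2}=\sum_{i}\lambda_{i}^{2}\leq\big(\sum_{i}\lambda_{i}\big)^{2}=(\operatorname{tr}T)^{2}$; hence we have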

[TABLE]

by (2.4) and (2.9).

Moreover, by the Strong Law of Large Numbers and the Continuous Mapping Theorem, $\lVert\bar{X}^{-1}_{n}-\mu^{-1}\rVert_{F}\rightarrow 0$ , almost surely. Also, again by the Strong Law of Large Numbers, $n^{-1}\sum_{j=1}^{n}\lVert X_{j}\rVert_{F}\rightarrow E\lVert X_{1}\rVert_{F}$ , almost surely. It is elementary to verify that $E\lVert X_{1}\rVert_{F}<\infty$ . Since $X_{1}>0$ and $\mu=E(X_{1})>0$ , we have $\lVert X_{1}\rVert_{F}\leq\operatorname{tr}X_{1}$ and so $E\lVert X_{1}\rVert_{F}\leq E(\operatorname{tr}X_{1})=\operatorname{tr}\mu<\infty$ . Therefore, (3.118) converges to 0, almost surely.

Second, we show that (3.119) tends to 0, almost surely. By (2.38), the fact that $\operatorname{etr}(-\alpha^{-1}T)\leq 1$ for $T>0$, and the triangle inequality, we have

[TABLE]

Further, by the triangle inequality, the absolute value of (3.119) is less than or equal to

[TABLE]

By the Cauchy-Schwarz inequality and the fact that $\int_{T>0}\hskip 1.0pt{\rm{d}}P_{0}(T)=1$ , (3.120) is seen to be less than or equal to

[TABLE]

Following the same argument as for the integral (3.118), we conclude that the integral (3.119) converges to $0$, almost surely.

Since $A_{\nu}(T,X_{j}^{1/2}\mu^{-1}X_{j}^{1/2})=A_{\nu}(T,\mu^{-1/2}X_{j}\mu^{-1/2}),$ we see that the integral (3.117) equals

[TABLE]

We subtract and add inside the squared term the orthogonally invariant Hankel transform of $\mu^{-1/2}X_{1}\mu^{-1/2}$ , i.e., the quantity $E[\Gamma_{m}(\alpha)A_{\nu}(T,\mu^{-1/2}X_{1}\mu^{-1/2})],$ and expand the integrand. Then we find that (3.117) equals

[TABLE]

By the Strong Law of Large Numbers in $L^{2}$ [49, p. 189, Corollary 7.10], we conclude that the term (3.121) converges to 0, almost surely.

Next, we show that (3.122) converges to 0, almost surely. By (2.38) and the fact that $\operatorname{etr}(-\alpha^{-1}T)\leq 1$ for $T>0$ , we have

[TABLE]

Therefore, the absolute value of the integral (3.122) is less than or equal to

[TABLE]

where the latter bound follows from the Cauchy-Schwarz inequality. Again, by the Strong Law of Large Numbers in $L^{2}$ , we conclude that the integral (3.122) converges to 0, almost surely.

We have now shown that

[TABLE]

Denote by $\Delta$ the right-hand side of (3.123); then $\Delta\geq 0$. Suppose that $\Delta=0$. Then

[TABLE]

equivalently, $\mathcal{\widetilde{H}}_{\mu^{-1/2}X_{1}\mu^{-1/2}}(T)-\operatorname{etr}(-\alpha^{-1}T)=0$, $P_{0}$-almost everywhere. By continuity, we obtain $\mathcal{\widetilde{H}}_{\mu^{-1/2}X_{1}\mu^{-1/2}}(T)-\operatorname{etr}(-\alpha^{-1}T)=0$ for all $T>0$. By the Uniqueness Theorem for orthogonally invariant Hankel transforms, it follows that $\mu^{-1/2}X_{1}\mu^{-1/2}$ has a Wishart distribution. By Muirhead [53, p. 92, Theorem 3.2.5], $X_{1}$ also has a Wishart distribution, which contradicts the assumption that $X_{1}$ does not have a Wishart distribution. Therefore, $\Delta>0$.

Under $H_{0}$ , $n^{-1}\boldsymbol{T}^{2}_{n}\xrightarrow{a.s.}0$ , and therefore $n^{-1}\boldsymbol{T}^{2}_{n}\xrightarrow{p}0$ , i.e., for any $\epsilon>0$ ,

[TABLE]

Thus, for any $\epsilon>0$ and $\gamma>0$ , there exists $n_{0}(\epsilon,\gamma)\in\mathbb{N}$ such that

[TABLE]

for all $n\geq n_{0}(\epsilon,\gamma)$. Let $c_{n,\gamma}$ be the $(1-\gamma)$-quantile of the test statistic $\boldsymbol{T}^{2}_{n}$ under $H_{0}$. Then $0\leq c_{n,\gamma}\leq n\epsilon$ for all $n\geq n_{0}(\epsilon,\gamma)$ since, by definition, $c_{n,\gamma}:=\inf\{x\geq 0:P_{H_{0}}(\boldsymbol{T}^{2}_{n}>x)\leq\gamma\}$. Therefore, $0\leq n^{-1}c_{n,\gamma}\leq\epsilon$ for all $n\geq n_{0}(\epsilon,\gamma)$. Since $\epsilon>0$ is arbitrary, we conclude that

$$\lim_{n\to\infty}n^{-1}c_{n,\gamma}=0.\tag{3.124}$$

By (3.123) and (3.124), we have $n^{-1}\boldsymbol{T}^{2}_{n}-n^{-1}c_{n,\gamma}\xrightarrow{a.s.}\Delta$, and therefore $n^{-1}\boldsymbol{T}^{2}_{n}-n^{-1}c_{n,\gamma}\xrightarrow{p}\Delta$. Thus, by Severini [61, p. 340, Corollary 11.3 (i)], we conclude that $n^{-1}\boldsymbol{T}^{2}_{n}-n^{-1}c_{n,\gamma}\xrightarrow{d}\Delta$. Further,

[TABLE]

Since the distribution function of the constant positive random variable $\Delta$ is continuous at 0, we conclude that

$$\lim_{n\to\infty}P\big(\boldsymbol{T}^{2}_{n}>c_{n,\gamma}\big)=1.$$

This concludes the proof. ∎

Remark 3.22.

We show that the assumption (3.116), made in Theorem 3.21, holds for two alternative distributions.

First, the matrix $F$ -distribution [43, Section 4, part (c)] or [38, Eqs. (65), (72)]: Let $X$ be a positive-definite random matrix with p.d.f.

[TABLE]

where $a>\tfrac{1}{2}(m-1)$ and $b>\tfrac{1}{2}(m+1)$ . Since $f(X)$ is orthogonally invariant then, by Schur’s Lemma, there exists a constant $c$ such that $\mu=E(X)=cI_{m}$ .

Last, a linear combination of two Wishart matrices: Let $X$ be a positive-definite random matrix with p.d.f.

[TABLE]

where $a>\tfrac{1}{2}(m-1)$ , $b>\tfrac{1}{2}(m-1)$ , and $\delta>1$ . By [30, Section 4.4], it is known that $X$ is equal in distribution to $X_{1}+\delta^{-1}X_{2}$ , where $X_{1}$ and $X_{2}$ are independent, $X_{1}\sim W_{m}(a,I_{m})$ and $X_{2}\sim W_{m}(b,I_{m})$ . Again, the distribution of $X$ is orthogonally invariant, therefore it satisfies (3.116).
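As a quick numerical sanity check of this second example, the sketch below draws $X=X_{1}+\delta^{-1}X_{2}$ and compares the sample mean with $(a+b/\delta)I_{m}$, the scalar multiple of the identity obtained from $E(X_{1})=aI_{m}$ and $E(X_{2})=bI_{m}$. The function names are hypothetical, the parameter values are arbitrary, and the sampler assumes that $2a$ and $2b$ are integers.

```python
import numpy as np


def sample_wishart_identity(rng, n, m, shape):
    """Draw n matrices from W_m(shape, I_m), whose mean is shape * I_m.

    Assumption: 2 * shape is an integer >= m (Gaussian construction Z'Z).
    """
    df = int(round(2 * shape))
    if abs(df - 2 * shape) > 1e-12 or df < m:
        raise ValueError("this sketch needs 2 * shape to be an integer >= m")
    z = rng.normal(scale=np.sqrt(0.5), size=(n, df, m))
    return np.einsum("nki,nkj->nij", z, z)


def sample_wishart_combination(rng, n, m, a, b, delta):
    """Draw n copies of X = X1 + X2 / delta with independent Wishart X1, X2."""
    x1 = sample_wishart_identity(rng, n, m, a)
    x2 = sample_wishart_identity(rng, n, m, b)
    return x1 + x2 / delta


rng = np.random.default_rng(0)
x = sample_wishart_combination(rng, n=20_000, m=3, a=2.5, b=4.0, delta=2.0)
# The sample mean should be close to (a + b / delta) * I_3 = 4.5 * I_3.
print(np.round(x.mean(axis=0), 2))
```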

4 Contiguous Alternatives to the Null Hypothesis

In this section, we derive the limiting distribution of the test statistic under a sequence of contiguous alternatives.

4.1 Assumptions

For $n\in\mathbb{N}$ and $m\geq 2$ , let $X_{n1},\dotsc,X_{nn}$ be a triangular array of row-wise independent $m\times m$ random matrices. As usual, let $P_{0}=W_{m}(\alpha,I_{m})$ , $\alpha>\max\{\tfrac{1}{2}(2m-1),\tfrac{1}{2}(m+3)\}$ , and let $Q_{n1}$ be a probability measure dominated by $P_{0}$ .

We wish to test the hypothesis

[TABLE]

against the alternative

[TABLE]

We write the Radon-Nikodym derivative of $Q_{n1}$ with respect to $P_{0}$ in the form

$$\frac{\mathrm{d}Q_{n1}}{\mathrm{d}P_{0}}=1+n^{-1/2}h_{n}.\tag{4.1}$$

We will need two assumptions in the sequel.

Assumptions 4.1.

We assume that:

(A1)

The functions $\{h_{n}:n\in\mathbb{N}\}$ form a sequence of $P_{0}$ -integrable functions converging pointwise, $P_{0}$ -almost everywhere, to a function $h$ , and

(A2)

$\sup_{n\in\mathbb{N}}E_{P_{0}}|h_{n}|^{4}<\infty$ .

Note that since $\int(\hskip 1.0pt{\rm{d}}Q_{n1}/\hskip 1.0pt{\rm{d}}P_{0})\,\hskip 1.0pt{\rm{d}}P_{0}=1$ then we also have $\int{h_{n}}\,\hskip 1.0pt{\rm{d}}P_{0}=0$ , for all $n\in\mathbb{N}$ . Denote the indicator function of an event $A$ by $I(A)$ . By applying (A2), we deduce the uniform integrability of $|h_{n}|^{2}$ :

[TABLE]
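The step from (A2) to uniform integrability can be spelled out with an elementary bound; the display below is a sketch in the notation of this section, with $k>0$ denoting a truncation level (a symbol introduced here for illustration). On the event $\{|h_{n}|>k\}$ we have $1<k^{-2}|h_{n}|^{2}$, so

```latex
\begin{align*}
\sup_{n\in\mathbb{N}} E_{P_{0}}\big[|h_{n}|^{2}\, I(|h_{n}|>k)\big]
  &\le \sup_{n\in\mathbb{N}} E_{P_{0}}\big[k^{-2}|h_{n}|^{4}\, I(|h_{n}|>k)\big] \\
  &\le k^{-2}\,\sup_{n\in\mathbb{N}} E_{P_{0}}|h_{n}|^{4}
  \;\longrightarrow\; 0 \qquad (k\to\infty).
\end{align*}
```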

By Bauer [9, p. 95, Theorem 2.11.4], the $P_{0}$ -almost everywhere convergence of $h_{n}$ to $h$ implies the $P_{0}$ -stochastic convergence of $h_{n}$ to $h$ . Again by Bauer [9, p. 104, Theorem 2.12.4], the uniform integrability of $|h_{n}|^{2}$ along with the $P_{0}$ -stochastic convergence of $h_{n}$ to $h$ imply the convergence of $h_{n}$ in mean square, i.e.,

[TABLE]

and therefore

[TABLE]

Since convergence in mean square implies convergence in mean, we have

[TABLE]

and thus,

[TABLE]

Now, due to the fact that $\int{h_{n}}\,\hskip 1.0pt{\rm{d}}P_{0}=0$ for all $n\in\mathbb{N}$ , we obtain

[TABLE]

4.2 Examples

In this subsection, we verify that Assumptions 4.1 are valid for a broad collection of sequences of contiguous alternatives.

4.2.1 Wishart alternatives with contiguous scale matrices

Let $Q_{n1}:=W_{m}(\alpha,\Sigma_{n})$ with $\alpha>\max\{\tfrac{1}{2}(2m-1),\tfrac{1}{2}(m+3)\}$ and $\Sigma_{n}=(1+\frac{1}{\sqrt{n}})I_{m}$ . Then,

[TABLE]

$X>0$ . We equate the Radon-Nikodym derivative to $1+n^{-1/2}h_{n}(X)$ , obtaining

[TABLE]

for $X>0$ . By applying L’Hospital’s rule, we obtain

[TABLE]

for $X>0$. Next, we bound $E_{P_{0}}|h_{n}|^{4}$. Define

[TABLE]

the remainder term of the Taylor series expansion of $\operatorname{etr}(-n^{-1/2}X)$ , $X>0$ . Then, by elementary algebraic manipulations, we obtain

[TABLE]

By (4.2), the triangle inequality, and the Lipschitz continuity of the exponential function, we have

[TABLE]

$X>0$ . Therefore,

[TABLE]

It is elementary that $(1+n^{-1/2})^{m\alpha-1}\to 1$ and $n^{1/2}\big{(}(1+n^{-1/2})^{m\alpha-1}-1\big{)}\to m\alpha-1$ as $n\to\infty$ ; therefore, there exists a positive constant $M$ such that $(1+n^{-1/2})^{m\alpha-1}\leq M$ and $\big{|}n^{1/2}\big{(}(1+n^{-1/2})^{m\alpha-1}-1\big{)}\big{|}\leq M$ for all $n$ . Therefore, $|h_{n}(X)|\leq M(1+6\operatorname{tr}X)+M=M(2+6\operatorname{tr}X)$ , $X>0$ , so we obtain

[TABLE]

and this bound does not depend on $n$ . By (2.4) and (2.9), the above integral is finite; thus, $\sup_{n\in\mathbb{N}}E_{P_{0}}|h_{n}|^{4}<\infty$ .

4.2.2 Wishart alternatives with contiguous shape parameters

Let $Q_{n1}:=W_{m}(\alpha_{n},I_{m})$ with $\alpha_{n}=\alpha+\frac{1}{\sqrt{n}}$ , $\alpha>\max\{\tfrac{1}{2}(2m-1),\tfrac{1}{2}(m+3)\}$ . We have

[TABLE]

$X>0$ . Following (4.1), we equate this Radon-Nikodym derivative to $1+n^{-1/2}h_{n}(X)$ , obtaining

[TABLE]

for $X>0$ . Recall the multivariate digamma function

[TABLE]

$z>0$ . Applying L’Hospital’s rule, we obtain

[TABLE]

$X>0$ . To calculate $E_{P_{0}}|h_{n}|^{4}$ , we apply the binomial expansion, obtaining

[TABLE]

thus,

[TABLE]

Next, the Taylor expansion of $\Gamma_{m}(\alpha)/\Gamma_{m}(\alpha+n^{-1/2})$ for sufficiently large values of $n$ is

[TABLE]

where $a_{0}=1$ .

After lengthy but straightforward calculations, we obtain

[TABLE]

Next, we substitute the Taylor expansion (4.4) in (4.3) and then take the limit as $n\to\infty$. Applying L’Hospital’s rule four times, we obtain, after some lengthy but straightforward calculations,

[TABLE]

Thus, $E_{P_{0}}|h_{n}|^{4}$ is a bounded sequence, and therefore $\sup_{n\in\mathbb{N}}E_{P_{0}}|h_{n}|^{4}<\infty$.

4.2.3 Contaminated Wishart models

Consider the contamination model,

[TABLE]

where, as usual, $\alpha>\max\{\tfrac{1}{2}(2m-1),\tfrac{1}{2}(m+3)\}$ . We note that contaminated Wishart models appear also in the analysis of diffusion tensor images [40].

We have

[TABLE]

for $X>0$ . Following (4.1), we equate this Radon-Nikodym derivative to $1+n^{-1/2}h_{n}(X)$ , obtaining

[TABLE]

for $X>0$ . Thus,

[TABLE]

$X>0$ . Since

[TABLE]

is clearly finite and does not depend on $n$, it follows that $\sup_{n\in\mathbb{N}}E_{P_{0}}|h_{n}|^{4}<\infty$.

We note also that the model (4.5) is a special case of the contamination model

[TABLE]

where $P_{1}$ is a probability measure dominated by $P_{0}$ , and $\int{(\hskip 1.0pt{\rm{d}}P_{1}/\hskip 1.0pt{\rm{d}}P_{0})^{4}}\,\hskip 1.0pt{\rm{d}}P_{0}<\infty$ . The preceding calculations can also be done for many choices of $P_{1}$ .

For example, consider the case in which $P_{1}$ is the probability measure corresponding to the matrix generalized inverse Gaussian distribution [15] with density function

[TABLE]

$X>0$ , where $c_{1}$ is the normalizing constant, $\Phi$ and $\Psi$ are symmetric non-negative definite matrices, and $b\in\mathbb{R}$ . Then

[TABLE]

where $c_{0}=1/\Gamma_{m}(\alpha)$ is the normalizing constant of $W_{m}(\alpha,I_{m})$ and $c=c_{1}^{4}/c_{0}^{3}$ . By [35, p. 506] and [15, Eq. (2)], we deduce that $\int{(\hskip 1.0pt{\rm{d}}P_{1}/\hskip 1.0pt{\rm{d}}P_{0})^{4}}\,\hskip 1.0pt{\rm{d}}P_{0}<\infty$ in the following cases:

(i)

$\Phi\geq 0$ , $\Psi-\tfrac{3}{4}I_{m}>0$ , $b\geq\tfrac{1}{4}(3\alpha+\tfrac{1}{2}m)$

(ii)

$\Phi>0$ , $\Psi-\tfrac{3}{4}I_{m}>0$ , $b\in\mathbb{R}$

(iii)

$\Phi>0$ , $\Psi-\tfrac{3}{4}I_{m}\geq 0$ , $b<\tfrac{1}{4}(3\alpha-\tfrac{1}{2}(m-1))$

Therefore, we deduce that the Assumptions 4.1 also hold for broad classes of the model $Q_{n2}$ .

4.3 The distribution of the test statistic under contiguous alternatives

Let $P_{0}=W_{m}(\alpha,I_{m})$ , $\alpha>\max\{\tfrac{1}{2}(2m-1),\tfrac{1}{2}(m+3)\}$ ; and denote by $\boldsymbol{P_{n}}=P_{0}\otimes\cdots\otimes P_{0}$ and $\boldsymbol{Q_{n}}=Q_{n1}\otimes\cdots\otimes Q_{n1}$ the $n$ -fold product probability measures of $P_{0}$ and $Q_{n1}$ , respectively.

Theorem 4.2.

Let $m\geq 2$ and $X_{n1},\dotsc,X_{nn}$ , $n\in\mathbb{N}$ , be a triangular array of $m\times m$ positive-definite row-wise i.i.d. random matrices, where $X_{nj}=X_{j}$ , $j=1,\dotsc,n$ . We assume that the distribution of $X_{nj}$ is $Q_{n1}$ , for every $j=1,\dotsc,n$ . Further, let $\mathcal{Z}_{n}=(\mathcal{Z}_{n}(T),T>0)$ be a random field with

[TABLE]

$T>0$ . Under the Assumptions 4.1, there exists a centered Gaussian field $\mathcal{Z}:=(\mathcal{Z}(T),T>0)$ with sample paths in $L^{2}$ and the covariance function $K(S,T)$ in (3.9), and a function

[TABLE]

$T>0$ , such that $\mathcal{Z}_{n}\xrightarrow{d}\mathcal{Z}+c$ in $L^{2}$ . Moreover, as $n\rightarrow\infty$ ,

[TABLE]

We note that the proof of this theorem and the subsequent results can be obtained by following the approach in [65, pp. 79–91] and Theorem 4.3 in [33]. In order to maintain a relatively self-contained presentation, we provide some of the details here.

Before proceeding to those details, we will present some preliminary results. Consider the log-likelihood ratio,

[TABLE]

From the definition of $\boldsymbol{P}_{n}$ and $\boldsymbol{Q}_{n}$ , we obtain

[TABLE]

Since $\Lambda_{n}=-\infty$ if and only if $1+n^{-1/2}h_{n}(X_{nj})=0$ for some $j$ , we obtain

[TABLE]

Since $h_{n}(X_{n1})=-n^{1/2}$ implies that $|n^{-1/2}h_{n}(X_{n1})|^{4}=1$, we have

[TABLE]

Under the assumption that $\sup_{n\in\mathbb{N}}E_{P_{0}}(|h_{n}|^{4})<\infty$ , we obtain

[TABLE]

as $n\to\infty$ . Therefore, without loss of generality, we shall assume that $\Lambda_{n}>-\infty$ and $1+n^{-1/2}h_{n}(X_{nj})>0$ for all $j=1,\dotsc,n$ and $n\geq 1$ (see [65, p. 140, Appendix D.2] or [71, p. 303, Example 6.118]).

The Taylor expansion of order $2$ of the function $\log(1+n^{-1/2}h_{n}(X_{nj}))$ at $1$ is

[TABLE]

with remainder term

[TABLE]

where $t_{nj}:\mathcal{P}_{+}^{m\times m}\to[0,1]$ is a measurable function. Therefore,

[TABLE]

In the following result, we use the notation $\sigma^{2}:=\int{|h|^{2}}\,\hskip 1.0pt{\rm{d}}P_{0}$ .

Lemma 4.3.

As $n\to\infty$ ,

(i)

$n^{-1/2}\sum_{j=1}^{n}h_{n}(X_{nj})\xrightarrow{d}\mathcal{N}(0,\sigma^{2})$ in $\boldsymbol{P}_{n}$-distribution.

(ii)

$n^{-1}\sum_{j=1}^{n}h_{n}^{2}(X_{nj})\rightarrow\sigma^{2}$ in $\boldsymbol{P}_{n}$-probability.

(iii)

$\sum_{j=1}^{n}R(h_{n}(X_{nj}))\rightarrow 0$ in $\boldsymbol{P}_{n}$-probability.

The proofs of these results are given in [65, pp. 80-83] and in [32]. Combining these three results, we conclude that under $\boldsymbol{P}_{n}$ ,

[TABLE]
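The normal limit of the log-likelihood ratio can be checked by simulation for the scale alternatives of Section 4.2.1. The sketch below rests on an assumption that is not stated in this section: that the density of $W_{m}(\alpha,\Sigma)$ is proportional to $(\det\Sigma)^{\alpha}(\det X)^{\alpha-(m+1)/2}\operatorname{etr}(-\Sigma X)$, which is suggested by the appearance of $\operatorname{etr}(-n^{-1/2}X)$ in that subsection. Under this assumption, $\log(\mathrm{d}Q_{n1}/\mathrm{d}P_{0})(X)=m\alpha\log(1+n^{-1/2})-n^{-1/2}\operatorname{tr}X$, the limit function is $h(X)=m\alpha-\operatorname{tr}X$, and $\sigma^{2}=m\alpha$ because $\operatorname{tr}X$ has a gamma distribution with shape $m\alpha$ under $P_{0}$. The function name is hypothetical.

```python
import numpy as np


def simulate_log_likelihood_ratio(n, m, alpha, reps=20_000, seed=0):
    """Draw `reps` copies of Lambda_n under P_n for the assumed scale alternative.

    Assumed form (see the lead-in):
        log dQ_n1/dP_0 (X) = m * alpha * log(1 + n^{-1/2}) - n^{-1/2} * tr X,
    with tr X ~ Gamma(m * alpha, 1) under P_0, so that the sum of the n traces
    is Gamma(n * m * alpha, 1).
    """
    rng = np.random.default_rng(seed)
    trace_sum = rng.gamma(shape=n * m * alpha, scale=1.0, size=reps)
    return n * m * alpha * np.log1p(n ** -0.5) - trace_sum / np.sqrt(n)


if __name__ == "__main__":
    m, alpha = 2, 3.0
    lam = simulate_log_likelihood_ratio(n=10_000, m=m, alpha=alpha)
    # Compare with the limiting mean -sigma^2 / 2 and variance sigma^2 = m * alpha.
    print(lam.mean(), -0.5 * m * alpha)
    print(lam.var(), m * alpha)
```

For large $n$, the sample mean and variance of the simulated values should be close to $-\sigma^{2}/2$ and $\sigma^{2}$, respectively.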

We introduced in Section 3.2 the random field

[TABLE]

$T>0$ . Also, we introduced in Theorem 3.5, the centered random field

[TABLE]

$T>0$ , where

[TABLE]

We proved that there exists a centered Gaussian field $\mathcal{Z}:=(\mathcal{Z}(T),T>0)$ with sample paths in $L^{2}$ and with covariance function $K(S,T)$ given in (3.9) such that, under $\boldsymbol{P_{n}}$ , $||\mathcal{Z}_{n}-\mathcal{Z}_{n,3}||_{L^{2}}\xrightarrow{p}0$ and $\mathcal{Z}_{n,3}\xrightarrow{d}\mathcal{Z}$ in $L^{2}$ . For $k\in\mathbb{N}$ and $T_{1},\dotsc,T_{k}\in\mathcal{P}_{+}^{m\times m}$ , it follows from the multivariate Central Limit Theorem that $\big{(}\mathcal{Z}_{n,3}(T_{1}),\dotsc,\mathcal{Z}_{n,3}(T_{k})\big{)}^{\prime}\xrightarrow{d}\mathcal{N}_{k}\big{(}0,\Sigma\big{)}$ under $\boldsymbol{P_{n}}$ , where $\Sigma=\big{(}K(T_{i},T_{j})\big{)}_{1\leq i,j\leq k}$ is the $k\times k$ positive definite matrix with $(i,j)$ th entry $K(T_{i},T_{j})$ .

Let $\|\cdot\|_{\mathbb{R}^{k+1}}$ denote the standard Euclidean norm on $\mathbb{R}^{k+1}$ . Then, by Lemma 4.3(iii),

[TABLE]

in $\boldsymbol{P}_{n}$ -probability.

Lemma 4.4.

For $T>0$ , define

[TABLE]

and set $\boldsymbol{c}=\big{(}c(T_{1}),\dotsc,c(T_{k})\big{)}^{\prime}$ . Then, under $\boldsymbol{P_{n}}$ ,

[TABLE]

and

[TABLE]

Proof.

Substituting for $\mathcal{Z}_{n,3}$ in (4.8), applying Assumptions 4.1, and carrying out some straightforward calculations, we obtain

[TABLE]

for $T>0$ . Letting

[TABLE]

then

[TABLE]

To establish (4.9), we will apply the Cramér-Wold device. Then it suffices to establish that for every $\boldsymbol{u}=(u_{1},\dotsc,u_{k+1})^{\prime}\in\mathbb{R}^{k+1}$ ,

[TABLE]

Now, let $Y_{1},Y_{2},\dotsc$ be i.i.d. $P_{0}$ -distributed $m\times m$ random matrices, and define

[TABLE]

Under $\boldsymbol{P_{n}}$ , $n^{-1/2}\sum_{j=1}^{n}k_{n}(Y_{j})$ has the same distribution as

[TABLE]

$\boldsymbol{u}\in\mathbb{R}^{k+1}$ . Since $E(w(T_{i},Y_{1}))=0$ , $i=1,\dotsc,k$ then

[TABLE]

Denote by $\tau^{2}_{n}$ the variance of $k_{n}(Y_{1})$ . Then,

[TABLE]

By Assumptions 4.1, we obtain $\mathop{\rm Var\,}\nolimits(h^{2}_{n}(Y_{1}))<\infty$ and ${\mathop{\rm Cov}}\big{(}h_{n}(Y_{1}),h^{2}_{n}(Y_{1})\big{)}<\infty.$ Thus, as $n\rightarrow\infty$ ,

[TABLE]

Similarly, it can be shown that, as $n\rightarrow\infty$ ,

[TABLE]

$P_{0}$ -almost surely. In addition, we notice that

[TABLE]

For every $\epsilon>0$ ,

[TABLE]

Also, for every $\epsilon>0$ ,

[TABLE]

from which we conclude that as $n\rightarrow\infty$ ,

[TABLE]

$P_{0}$ -almost surely. As the results (4.12) – (4.16) are the sufficient conditions in Pratt’s version of the Dominated Convergence Theorem [31, p. 221, Theorem 5.5], we conclude that as $n\to\infty$ ,

[TABLE]

This result is equivalent to the Lindeberg condition, i.e., for every $\epsilon>0$ ,

[TABLE]

Thus, we deduce from the Lindeberg-Feller Central Limit Theorem that

[TABLE]

therefore,

[TABLE]

Note also that $(0,\dotsc,0,-2^{-1}\sigma^{2})\boldsymbol{u}=-2^{-1}\sigma^{2}u_{k+1}$ and that

[TABLE]

Therefore, (4.11) is proved. Finally, (4.10) follows from (4.3), (4.11), and [10, p. 25, Theorem 4.1]. ∎

Now, we proceed to the proof of Theorem 4.2.

Proof of Theorem 4.2.

By (4.6) and Le Cam’s first lemma (see [65, p. 140, Theorem D.5] or [71, p. 311, Corollary 6.124]), $\boldsymbol{P}_{n}$ and $\boldsymbol{Q}_{n}$ are mutually contiguous. Also, by (4.10) and Le Cam’s third lemma (see [65, p. 141, Theorem D.6] or [71, p. 329, Corollary 6.139]), under $\boldsymbol{Q}_{n}$ ,

[TABLE]

By [65, p. 138, Theorem D.2] or [71, p. 56, Theorem 5.51], the convergence in distribution of $\mathcal{Z}_{n,3}$ under $\boldsymbol{P}_{n}$ in $L^{2}$ implies that $\mathcal{Z}_{n,3}$ is tight in $L^{2}$ under $\boldsymbol{P}_{n}$ . Further, since $\boldsymbol{Q}_{n}$ is contiguous to $\boldsymbol{P}_{n}$ , by [65, p. 139, Theorem D.4] or [71, p. 295, Theorem 6.113 (a)], $\mathcal{Z}_{n,3}$ is tight in $L^{2}$ under $\boldsymbol{Q}_{n}$ .

By (4.17) and the tightness of $\mathcal{Z}_{n,3}$ in $L^{2}$ under $\boldsymbol{Q}_{n}$ , we obtain $\mathcal{Z}_{n,3}\xrightarrow{d}\mathcal{Z}+c$ under $\boldsymbol{Q}_{n}$ (see [18, Theorem 2, Example 4]). Moreover, since $\|\mathcal{Z}_{n}-\mathcal{Z}_{n,3}\|_{L^{2}}\xrightarrow{p}0$ under $\boldsymbol{P}_{n}$ and $\boldsymbol{Q}_{n}$ is contiguous to $\boldsymbol{P}_{n}$ , we have under $\boldsymbol{Q}_{n}$ , $\|\mathcal{Z}_{n}-\mathcal{Z}_{n,3}\|_{L^{2}}\xrightarrow{p}0$ . Thus, by Billingsley [10, p. 25, Theorem 4.1], we obtain $\mathcal{Z}_{n}\xrightarrow{d}\mathcal{Z}+c$ under $\boldsymbol{Q}_{n}$ .

Finally, by the Continuous Mapping Theorem [10, p. 31, Corollary 1], we have $\lVert\mathcal{Z}_{n}\rVert^{2}_{L^{2}}\xrightarrow{d}\lVert\mathcal{Z}+c\rVert^{2}_{L^{2}}$ under $\boldsymbol{Q}_{n}$ , i.e.,

[TABLE]

under $\boldsymbol{Q}_{n}$ . The proof now is complete. ∎
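The mean shift produced by Le Cam’s third lemma can be seen in a scalar analogue of the theorem. The sketch below does not use the random field $\mathcal{Z}_{n}$; it uses the hypothetical statistic $W_{n}=n^{-1/2}\sum_{j=1}^{n}(\operatorname{tr}X_{j}-m\alpha)$ and again assumes that $W_{m}(\alpha,\Sigma)$ has density proportional to $(\det\Sigma)^{\alpha}(\det X)^{\alpha-(m+1)/2}\operatorname{etr}(-\Sigma X)$, with $\Sigma_{n}=(1+n^{-1/2})I_{m}$. Under that assumption $h(X)=m\alpha-\operatorname{tr}X$, so the predicted shift is the $P_{0}$-covariance of $\operatorname{tr}X-m\alpha$ and $h(X)$, namely $-m\alpha$.

```python
import numpy as np


def simulate_shift(n, m, alpha, reps=20_000, seed=0):
    """Draw W_n = n^{-1/2} * sum_j (tr X_j - m * alpha) under P_n and under Q_n.

    Assumed model (see the lead-in): tr X_j ~ Gamma(m * alpha, rate 1) under P_0
    and Gamma(m * alpha, rate 1 + n^{-1/2}) under Q_n1, so the sum of the n
    traces is again gamma-distributed with shape n * m * alpha.
    """
    rng = np.random.default_rng(seed)
    shape = n * m * alpha
    under_p = (rng.gamma(shape, 1.0, size=reps) - shape) / np.sqrt(n)
    scale_q = 1.0 / (1.0 + n ** -0.5)
    under_q = (rng.gamma(shape, scale_q, size=reps) - shape) / np.sqrt(n)
    return under_p, under_q


if __name__ == "__main__":
    w_p, w_q = simulate_shift(n=10_000, m=2, alpha=3.0)
    # Centered under P_n; shifted by approximately -m * alpha = -6 under Q_n.
    print(w_p.mean(), w_q.mean())
```

The variance is essentially unchanged between the two sets of draws; only the mean moves, which is the scalar counterpart of the limit $\mathcal{Z}+c$.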

5 The Efficiency of the Test

In this section, we investigate the approximate Bahadur slope of the test statistic $\boldsymbol{T}_{n}^{2}$ under local alternatives. Further, we show the validity of a modified Wieand condition. The proof of Wieand’s condition, under which the Bahadur and Pitman efficiencies agree, remains an open problem. By applying the results of this section, we are able to calculate the approximate asymptotic relative efficiency (ARE) of the proposed test relative to potential alternative tests.

For $m\geq 2$, let $X_{1},X_{2},\dotsc$ be i.i.d., $m\times m$ positive-definite random matrices with unknown distribution $P$. We assume that $P$ is indexed by a parameter $\theta\in\Theta:=(-\eta,\eta)$, for some $\eta>0$. We let $\theta\in\Theta_{0}=\{\theta_{0}\}=\{0\}$ represent the null hypothesis and $\theta\in\Theta_{1}=\Theta\setminus\{0\}$ represent the alternative hypothesis. In Section 3, we showed that $\boldsymbol{T}^{2}_{n}$ is scale-invariant, i.e., it does not depend on the unknown scale matrix $\Sigma$. Thus, under the null hypothesis $\theta_{0}=0$, we assume that $X_{1},X_{2},\dotsc$ are i.i.d., $m\times m$ positive-definite $P_{0}$-distributed random matrices and, under the local alternatives represented by $\theta\in\Theta_{1}$, we assume that $X_{1},X_{2},\dotsc$ are i.i.d., $m\times m$ positive-definite $P_{\theta}$-distributed random matrices.

The Radon-Nikodym derivative of $P_{\theta}$ with respect to $P_{0}$ is $dP_{\theta}/{dP_{0}}=1+\theta h_{\theta}$ . We assume that as $\theta\rightarrow 0$ , the function $h_{\theta}$ converges to some function $h$ in mean square, i.e.,

[TABLE]

Since $\int{(\hskip 1.0pt{\rm{d}}P_{\theta}/\hskip 1.0pt{\rm{d}}P_{0})}\,\hskip 1.0pt{\rm{d}}P_{0}=1$ , we have

[TABLE]

for $\theta\in\Theta_{1}$ . Further, we shall assume that for $\theta\in\Theta_{1}$ ,

[TABLE]

5.1 The approximate Bahadur slope of the test

For a description of the approximate Bahadur slope of a test under local alternatives and for the definition of a standard sequence, we refer to Bahadur [5, 6], Taherizadeh [65, Chapter 5] or to Section 5 in [33].

We have the following result for the test statistic $\boldsymbol{T}_{n}^{2}$ .

Theorem 5.1.

The sequence of test statistics $\{\boldsymbol{T}_{n}:n\in\mathbb{N}\}$ is a standard sequence. Further, $a=\tilde{\delta}_{1}^{-1}$ , the inverse of the largest eigenvalue of the covariance operator $\mathcal{S}$ ,

[TABLE]

and

[TABLE]

Proof. The proof of this theorem follows along the lines of the proof of Theorem 5.1 in [33]. For completeness, we provide the details here.

First, we will establish that $\{\boldsymbol{T}_{n}:n\in\mathbb{N}\}$ is a standard sequence. In Section 3, we showed that the limiting null distribution of the test statistic $\boldsymbol{T}^{2}_{n}$ is the same as that of $\sum_{k\geq 1}\tilde{\delta}_{k}\chi^{2}_{N(\tilde{\delta}_{k})}$ , where $\tilde{\delta}_{k}$ , $k\geq 1$ is an enumeration, listed in non-increasing order, of the distinct eigenvalues of $\mathcal{S}$ with corresponding multiplicities $N(\tilde{\delta}_{k})$ , and $\{\chi^{2}_{N(\tilde{\delta}_{k})}\}$ are i.i.d. $\chi^{2}_{N(\tilde{\delta}_{k})}$ -distributed random variables. From the Monotone Convergence Theorem, we have

[TABLE]

which is finite since $\mathcal{S}$ is of trace-class. Thus, $\sum_{k\geq 1}\tilde{\delta}_{k}\chi^{2}_{N(\tilde{\delta}_{k})}$ is almost surely a positive random variable with continuous probability distribution function.

By Zolotarev [73],

[TABLE]

where $o_{p}(1)\xrightarrow{t\rightarrow\infty}0$ . Therefore,

[TABLE]

which converges to $\tilde{\delta}_{1}^{-1}$ as $t\rightarrow\infty$ .
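The role of the largest eigenvalue in this limit can be checked in closed form in a toy case. The sketch below assumes just two eigenvalues $\tilde{\delta}_{1}>\tilde{\delta}_{2}>0$, each of multiplicity 2; this configuration is hypothetical and is chosen because each $\tilde{\delta}_{k}\chi^{2}_{2}$ is then exponentially distributed, so the tail probability of the sum $Y$ is explicit. The function `tail_slope` (an illustrative name) evaluates the normalized log-tail $-2t^{-1}\log P(Y>t)$.

```python
import math


def tail_slope(t, delta1, delta2):
    """Return -(2 / t) * log P(Y > t) for Y = delta1 * chi2_2 + delta2 * chi2_2'.

    For independent chi-square variables on 2 degrees of freedom and
    delta1 > delta2 > 0, Y is a sum of two independent exponentials, and
    P(Y > t) = (delta1 * exp(-t / (2 * delta1))
                - delta2 * exp(-t / (2 * delta2))) / (delta1 - delta2).
    """
    if not (delta1 > delta2 > 0 and t > 0):
        raise ValueError("need delta1 > delta2 > 0 and t > 0")
    # Factor out exp(-t / (2 * delta1)) so the logarithm stays finite for large t.
    gap = t / (2.0 * delta2) - t / (2.0 * delta1)
    log_tail = (-t / (2.0 * delta1)
                + math.log(delta1 - delta2 * math.exp(-gap))
                - math.log(delta1 - delta2))
    return -2.0 * log_tail / t


if __name__ == "__main__":
    for t in (10.0, 100.0, 1000.0):
        # The values approach 1 / delta1 = 2 as t grows.
        print(t, tail_slope(t, delta1=0.5, delta2=0.2))
```

In this toy case the smaller eigenvalue affects only a term of order $t^{-1}$, so the limit is governed by the largest eigenvalue alone.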

By assumption (5.3), for $\theta\in\Theta_{1}$ ,

[TABLE]

From the proof of Theorem 3.21, we have

[TABLE]

Since $\hskip 1.0pt{\rm{d}}P_{\theta}/\hskip 1.0pt{\rm{d}}P_{0}=1+\theta h_{\theta}$ then, by (5.2),

[TABLE]

and then it follows that $n^{-1}\boldsymbol{T}^{2}_{n}\xrightarrow{p}b^{2}(\theta)$ , the function defined in (5.4). Therefore, $n^{-1/2}\boldsymbol{T}_{n}\xrightarrow{p}b(\theta)$ in $P_{\theta}$ -probability, so the sequence of test statistics $\{\boldsymbol{T}_{n}:n\in\mathbb{N}\}$ is a standard sequence.

Finally, we find the limiting approximate Bahadur slope, as $\theta\rightarrow 0$ . By applying the Cauchy-Schwarz inequality, (2.38), and assumption (5.1), it is straightforward to establish that

[TABLE]

Therefore,

[TABLE]

The proof is now complete. ∎

5.2 A modified form of Wieand’s condition

Wieand [69] showed that if two standard sequences of test statistics satisfy an additional condition, now called the Wieand condition, then the limiting approximate Bahadur efficiency is in accord with the limiting Pitman efficiency, as the level of significance decreases to $0$. For a description of Pitman’s asymptotic relative efficiency, we refer to Taherizadeh [65, Chapter 5] or to Section 5 in [33]. Although the proof of Wieand’s condition remains an open problem in the matrix setting, we show that a modified form of Wieand’s condition is valid for the test statistics $\{\boldsymbol{T}_{n}:n\in\mathbb{N}\}$.

Theorem 5.2.

There exists a constant $\theta^{*}>0$ such that for any $\epsilon>0$ and $\gamma\in(0,1)$ , there exists a constant $C>0$ such that

[TABLE]

for any $\theta\in\Theta_{1}\cap(-\theta^{*},\theta^{*})$ and $n^{1/2}>C/b^{2}(\theta)$ .

Proof. For $T>0$ and $\theta\in\Theta$ , consider the orthogonally invariant Hankel transform,

[TABLE]

We have

[TABLE]

By adding and subtracting the term $\mathcal{H}_{X_{1},\theta}(T)$ inside the squared term, and then applying Minkowski’s inequality, we obtain

[TABLE]

Now set

[TABLE]

By adding and subtracting the term

[TABLE]

inside the squared term, and then again applying Minkowski’s inequality, we get

[TABLE]

Combining (5.5) and (5.6), we conclude that

[TABLE]

Further, by subtracting and adding the term

[TABLE]

inside the squared term

[TABLE]

and then applying the Cauchy-Schwarz inequality, we obtain

[TABLE]

Next, by (3.115),

[TABLE]

Since

[TABLE]

and since the trace is invariant under cyclic permutations and the Frobenius norm is sub-multiplicative then

[TABLE]

By the Cauchy-Schwarz inequality,

[TABLE]

Since $\bar{X}_{n}^{-1/2}X_{j}\bar{X}_{n}^{-1/2}$ is a positive definite matrix then

[TABLE]

and therefore,

[TABLE]

Therefore,

[TABLE]

and by (5.9), we obtain

[TABLE]

By (5.7), Markov’s inequality, and Fubini’s theorem,

[TABLE]

By (5.8) and (5.10), we see that (5.11) is greater than or equal to

[TABLE]

In Theorem 3.21, we showed that $\tilde{C}:=\int_{T>0}{\lVert T\rVert_{F}}\,\hskip 1.0pt{\rm{d}}P_{0}(T)<\infty$ . Further, by (2.38),

[TABLE]

therefore

[TABLE]

Next, we write

[TABLE]

and expand the sum. By the Cauchy-Schwarz inequality, and using the i.i.d. property of $X_{1},\ldots,X_{n}$ , we obtain

[TABLE]

Squaring the above sum and using the fact that $X_{1},\ldots,X_{n}$ are i.i.d., we obtain

[TABLE]

Since $E_{0}h_{\theta}(X)=0$ and, by (5.3), $E_{0}Xh_{\theta}(X)=0$ for $\theta\in\Theta_{1}$ , then $E_{0}(1+\theta h_{\theta}(X_{1}))=1$ and

[TABLE]

Thus, the first term in the right-hand side of (5.2) equals

[TABLE]

and the second term equals $0$.

Further, by applying the Cauchy-Schwarz inequality, we also find that

[TABLE]

To show that $E_{0}\big{(}\operatorname{tr}\big{[}(X_{1}-\alpha I_{m})^{2}\big{]}\big{)}^{2}$ is finite, we write

[TABLE]

and since $(a+b+c)^{2}\leq 3(a^{2}+b^{2}+c^{2})$ , for $a,b,c\in\mathbb{R}$ , it is sufficient to show that $E_{0}(\operatorname{tr}X_{1}^{2})^{2}<\infty$ and $E_{0}(\operatorname{tr}X_{1})^{2}<\infty$ . However,

[TABLE]

by (2.9). By another application of (2.9),

[TABLE]

By assumption (5.1), we conclude that there exists $\theta^{*}\in(0,\eta)$ such that

[TABLE]

Therefore, (5.12) can be written as

[TABLE]

for all $\theta\in(-\theta^{*},\theta^{*})$ . Setting $C=\big{(}8\alpha^{-2}m^{5/2}\,\tilde{C}\bar{\sigma}+2\big{)}/\epsilon^{2}\gamma$ then, for all $\theta\in(-\theta^{*},\theta^{*})$ and $n^{1/2}>C/b^{2}(\theta)$ ,

[TABLE]

The proof now is complete. ∎

Bibliography

[1]
[2] Anfinsen, S. N., and Eltoft, T. Application of the matrix-variate Mellin transform to analysis of polarimetric radar images. IEEE Transactions on Geoscience and Remote Sensing, 49, 2281–2295, 2011.
[3] Anfinsen, S. N., Doulgeris, A. P., and Eltoft, T. Goodness-of-fit tests for multilook polarimetric radar data based on the Mellin transform. IEEE Transactions on Geoscience and Remote Sensing, 49, 2764–2781, 2011.
[4] Asai, M., McAleer, M., and Yu, J. Multivariate stochastic volatility: a review. Econometric Reviews, 25, 145–175, 2006.
[5] Bahadur, R. R. Stochastic comparison of tests. Annals of Mathematical Statistics, 31, 276–295, 1960.
[6] Bahadur, R. R. Some Limit Theorems in Statistics. SIAM, Philadelphia, PA, 1971.
[7] Baringhaus, L., and Taherizadeh, F. Empirical Hankel transforms and their applications to goodness-of-fit tests. Journal of Multivariate Analysis, 101, 1445–1467, 2010.
[8] Baringhaus, L., and Taherizadeh, F. A K-S type test for exponentiality based on empirical Hankel transforms. Communications in Statistics - Theory and Methods, 42, 3781–3792, 2013.
