Inference For High-Dimensional Split-Plot-Designs: A Unified Approach for Small to Large Numbers of Factor Levels
Paavo Sattler, Markus Pauly

TL;DR
This paper develops robust inference procedures for high-dimensional split-plot designs with many factors and groups, applicable in life sciences where large numbers of observations per subject are common.
Contribution
It introduces a unified approach for inference in heteroscedastic split-plot designs with high-dimensional data, extending classical methods to handle increasing dimensions and groups.
Findings
Procedures are robust against increasing dimensions and groups.
Limit distributions are characterized in a general asymptotic framework.
Small sample approximations improve inference accuracy.
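Table 1: True asymptotic levels of the tests with fixed critical values for different chosen levels $\alpha$.

| Chosen level $\alpha$ | Normal quantile, $\beta_1 \to 0$ | Normal quantile, $\beta_1 \to 1$ | Standardized $\chi^2_1$ quantile, $\beta_1 \to 0$ | Standardized $\chi^2_1$ quantile, $\beta_1 \to 1$ |
|---|---|---|---|---|
| 0.10 | 0.10 | 0.09354 | 0.11391 | 0.10 |
| 0.05 | 0.05 | 0.06819 | 0.02226 | 0.05 |
| 0.01 | 0.01 | 0.03834 | 0.00003 | 0.01 |
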
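
Table 2: Results of the sleep laboratory data analysis.

| Hypothesis | Test statistic | Estimated degrees of freedom | p-value |
|---|---|---|---|
| (a) No gender effect | -0.45671 | 1.19030 | 0.55832 |
| (b) No time effect | 6.24114 | 7.07832 | 0.00008 |
| (c) No interaction effect between time and group | 0.74578 | 7.21217 | 0.20120 |
| (d) No time effect for a single intervention | -0.795083 | 461.874 | 0.784463 |
| (d) No time effect for a single intervention | -0.591851 | 360.048 | 0.71764 |
| (d) No time effect for a single intervention | -0.43381 | 223.24000 | 0.65845 |
| (d) No time effect for a single intervention | -1.18382 | 426.083 | 0.88385 |
| (e) No effect between interventions 1 and 2 | 2.37921 | 155.89025 | 0.01285 |
| (e) No effect between two interventions | 0.23757 | 156.64141 | 0.39240 |
| (e) No effect between two interventions | -0.49984 | 143.57718 | 0.68099 |
| (e) No effect between two interventions | -0.72716 | 91.83337 | 0.75968 |
| (e) No effect between two interventions | -0.56510 | 79.78169 | 0.70183 |
| (e) No effect between two interventions | -0.66704 | 130.56430 | 0.74046 |
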

| $d$ | 5 | 10 | 20 | 40 | 70 | 100 | 150 | 200 | 300 | 450 | 600 | 800 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |

| $d$ | 5 | 10 | 20 | 40 | 70 | 100 | 150 | 200 | 300 | 450 | 600 | 800 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 0.50 | 0.36 | 0.21 | 0.11 | 0.064 | 0.045 | 0.03 | 0.022 | 0.015 | 0.010 | 0.0074 | 0.0056 |

Paavo Sattler1 and Markus Pauly1
1University of Ulm, Institute of Statistics
Abstract: Statisticians increasingly face the problem of reconsidering the adaptability of classical inference techniques. In particular, diverse types of high-dimensional data structures are observed in various research areas, disclosing the boundaries of conventional multivariate data analysis. Such situations occur frequently, e.g., in life sciences whenever it is easier or cheaper to repeatedly generate a large number $d$ of observations per subject than to recruit many, say $N$, subjects. In this paper we discuss inference procedures for such situations in general heteroscedastic split-plot designs with $a$ independent groups of repeated measurements. These will, e.g., be able to answer questions about the occurrence of certain time, group and interaction effects or about particular profiles.
The test procedures are based on standardized quadratic forms involving suitably symmetrized U-statistics-type estimators which are robust against an increasing number of dimensions $d$ and/or groups $a$. We then discuss their limit distributions in a general asymptotic framework and additionally propose improved small sample approximations. Finally, their small sample performance is investigated in simulations and the applicability is illustrated by a real data analysis.
**Keywords:** Approximations, High-dimensional Data, Quadratic Forms, Repeated Measures, Split-plot designs
1 Introduction
In our current century of data, statisticians increasingly face the problem of reconsidering the adaptability of classical inferential techniques. In particular, diverse types of high-dimensional data structures are observed in various research areas, disclosing the boundaries of conventional multivariate data analysis. Here, the curse of high dimensionality, i.e. the problem of a large dimension combined with a small sample size, is especially encountered in life sciences whenever it is easier (or cheaper) to repeatedly generate a large number $d$ of observations per subject than to recruit many, say $N$, subjects. Similar observations can be made in industrial sciences with subjects replaced by units. Such designs, where experimental units are repeatedly observed under different conditions or at different time points, are called repeated measures designs or (if two or more groups are observed) split-plot designs. In these trials, one would like to answer questions about the occurrence of certain group or time effects or about particular profiles. Conventionally, when the dimension is smaller than the sample size, the corresponding null hypotheses are tested with Hotelling's $T^2$ (one or two sample case) or Wilks's $\Lambda$, see e.g. [13, Section 4.3] or [21, Section 6.8]. Besides normality, these procedures heavily rely on the assumption of equal covariance matrices and in particular break down in high-dimensional settings where the dimension exceeds the sample size. While there exist several promising approaches to adequately deal with the problem of covariance heterogeneity in the classical low-dimensional case (see e.g. [6, 16, 17, 20, 27, 37, 1, 24, 9, 32, 35, 26, 18, 15]), most procedures for high-dimensional repeated measures designs rely on certain sparsity conditions (see e.g. [2, 11, 23, 30, 34, 10, 19] and the references cited therein). In particular, in an asymptotic framework, typical assumptions restrict the way the sample size and/or various powers of traces of the underlying covariances increase with respect to $d$.
These types of sparsity conditions guarantee central limit theorems that lead to approximations of the underlying test statistics by a fixed limit distribution. However, as illustrated in [31] for one-sample repeated measures, these conditions can in general not be regarded as regularity assumptions. In particular, they may even fail for classical covariance structures. To this end, the authors proposed a novel approximation technique that showed considerably accurate results and investigated its asymptotic behavior in a flexible and non-restrictive framework. Here, no assumptions regarding the dependence between the dimension and the sample size or regarding the covariance matrix were made. In the current paper, we follow this approach and extend the results of [31] to general heteroscedastic split-plot designs with $a$ independent groups of repeated measurements. To even allow for a large number of groups as in [3, 4] or [39], we do not only consider the case with a fixed number $a$ of samples but additionally allow for situations with $a \to \infty$. The latter case is of particular interest if most groups are rather small (as in screening trials) such that a classical test would essentially possess no power for fixed $a$. Here, increasing the number of groups implies increasing the total sample size, from which a power increase might be expected as well. This leads to one of the following asymptotic frameworks
[TABLE]
which we handle simultaneously in the sequel. For all considerations, the adequate and dimension-stable estimation of traces of certain powers of combined covariances turned out to be a major problem. It is tackled by introducing novel symmetrized estimators of $U$-statistics-type which possess nice asymptotic properties under all asymptotic frameworks given above.
The paper is organized as follows. The statistical model together with the considered hypotheses of interest is introduced in Section 2. The test statistic and its asymptotic behavior are investigated in Section 3, where novel dimension-stable trace estimators are also introduced. Additional approximations for small sample sizes are theoretically discussed in Section 4 and their performance is studied in simulations in Section 5. Afterwards, the new methods are applied to analyze a high-dimensional data set from a sleep-laboratory trial in Section 6. The paper closes with a discussion and an outlook. All proofs are deferred to the supplementary material.
2 Statistical Model and Hypotheses
We consider a split-plot design given by $a$ independent groups of $d$-dimensional random vectors

$$X_{i,j} = (X_{i,j,1},\dots,X_{i,j,d})^\top \sim \mathcal{N}_d(\mu_i, \Sigma_i), \qquad j = 1,\dots,n_i, \quad i = 1,\dots,a, \tag{1}$$

which are independent, with mean vectors $\mu_i = (\mu_{i,1},\dots,\mu_{i,d})^\top$ and positive definite covariance matrices $\Sigma_i$. Here $j = 1,\dots,n_i$ denotes the individual subjects or units in group $i$, $i = 1,\dots,a$, where no specific structure of the group-specific covariance matrices is assumed. In particular, they are even allowed to differ completely. Altogether we have a total number of $N = \sum_{i=1}^{a} n_i$ random vectors representing observations from independent subjects. Within this framework, a factorial structure on the factors group or time can be incorporated by splitting up indices. Also, a group-specific random subject effect can be incorporated as outlined in [31, Equation (2.2)].
Writing $\mu = (\mu_1^\top,\dots,\mu_a^\top)^\top$, linear hypotheses of interest in this general split-plot model are formulated as

$$H_0(H):\ H\mu = 0$$

for a proper hypothesis matrix $H$. It is of the form $H = H_W \otimes H_S$, where $H_S$ and $H_W$ refer to subplot (time) and/or whole-plot (group) effects. For theoretical considerations it is often more convenient to reformulate $H_0(H)$ by means of the corresponding projection matrix $T = H^\top [H H^\top]^{-} H$, see e.g. [31]. Here $[H H^\top]^{-}$ denotes some generalized inverse of the matrix $H H^\top$, and $H_0(H)$ can equivalently be written as $H_0(T): T\mu = 0$. It is a simple exercise to prove that the matrix $T$ is of the form $T = T_W \otimes T_S$ for projection matrices $T_W$ and $T_S$, see Lemma A.1 in the supplement. Typical examples are given by
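- (a) No group effect: $H_0^{a}: \big(P_a \otimes \tfrac{1}{d} J_d\big)\mu = 0$,
- (b) No time effect: $H_0^{b}: \big(\tfrac{1}{a} J_a \otimes P_d\big)\mu = 0$,
- (c) No interaction effect between time and group: $H_0^{ab}: \big(P_a \otimes P_d\big)\mu = 0$,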
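where $J_d$ is the $d \times d$ matrix only containing 1s and $P_d = I_d - \tfrac{1}{d} J_d$ is the centring matrix. For interpretational purposes it is sometimes helpful to decompose the component-wise means as

$$\mu_{i,t} = \mu + \alpha_i + \beta_t + (\alpha\beta)_{it}, \qquad i = 1,\dots,a, \quad t = 1,\dots,d,$$

where $\alpha_i$ represents the $i$-th group effect, $\beta_t$ the time effect at time point $t$ and $(\alpha\beta)_{it}$ the $(i,t)$-interaction effect between group and time with the usual side conditions $\sum_i \alpha_i = \sum_t \beta_t = \sum_i (\alpha\beta)_{it} = \sum_t (\alpha\beta)_{it} = 0$. With this notation the above null hypotheses can be rewritten as (a) $\alpha_i = 0$ for all $i$, (b) $\beta_t = 0$ for all $t$ and (c) $(\alpha\beta)_{it} = 0$ for all $i$ and $t$, respectively.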
These and other hypotheses will be utilized in the data analysis Section 6.
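To make the Kronecker structure of such hypothesis matrices concrete, the following minimal Python sketch builds projection matrices for the three example hypotheses and applies them to a small mean vector. It assumes the standard choices $P_a \otimes \tfrac{1}{d} J_d$ (group), $\tfrac{1}{a} J_a \otimes P_d$ (time) and $P_a \otimes P_d$ (interaction) with the centring matrix $P_n = I_n - \tfrac{1}{n} J_n$. All function names and the example means are hypothetical and only serve as illustration.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def ones_mean(n):
    """(1/n) J_n: every entry equals 1/n."""
    return [[Fraction(1, n)] * n for _ in range(n)]


def centring(n):
    """Centring matrix P_n = I_n - (1/n) J_n."""
    eye, avg = identity(n), ones_mean(n)
    return [[eye[i][j] - avg[i][j] for j in range(n)] for i in range(n)]


def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]


def matvec(A, v):
    return [sum(x * y for x, y in zip(row, v)) for row in A]


def hypothesis_matrices(a, d):
    """Assumed projection matrices for group, time and interaction effects."""
    return {
        "group": kron(centring(a), ones_mean(d)),
        "time": kron(ones_mean(a), centring(d)),
        "interaction": kron(centring(a), centring(d)),
    }


def is_projection(T):
    n = len(T)
    symmetric = all(T[i][j] == T[j][i] for i in range(n) for j in range(n))
    return symmetric and matmul(T, T) == T


# Hypothetical means: a = 2 groups, d = 3 time points, parallel profiles.
mu = [1, 2, 3, 3, 4, 5]
for name, T in hypothesis_matrices(2, 3).items():
    print(name, is_projection(T), matvec(T, mu))
```

In this toy example the two group profiles are parallel but shifted and non-constant over time, so the interaction hypothesis holds while the group and time hypotheses do not. Exact rational arithmetic is used so that idempotence can be checked without rounding tolerances.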
3 The Test Statistic and its Asymptotics
We derive appropriate inference procedures for and analyze their asymptotic properties under the following asymptotic frameworks
[TABLE]
as . Here, no dependency on how the dimension in (3) and (5) or the number of groups in (4)-(5) converges to infinity with respect to the sample sizes and is postulated. In particular, we cover high-dimensional ( or even ) as well as low-dimensional settings. For a lucid presentation of subsequent results and proofs we additionally assume throughout that
[TABLE]
However, by turning to convergent subsequences, all results can be shown to hold under the more general condition
[TABLE]
It is convenient to measure deviations from the null hypothesis by means of the quadratic form
[TABLE]
where with denotes the vector of pooled group means.
Since is in general asymptotically degenerated under (3)-(5) we study its standardized version. To this end, note that under the null hypothesis it holds that
[TABLE]
due to assumption (1). Thus, it follows from classical theorems about moments of quadratic forms, see e.g. [29] or A.4 in the supplement, that its mean and variance under the null hypothesis can be expressed as
[TABLE]
[TABLE]
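Henceforth we investigate the asymptotic behaviour (under the null hypothesis) of the standardized quadratic form. Denoting by $V_N = \bigoplus_{i=1}^{a} \frac{N}{n_i}\Sigma_i$ the inversely weighted combined covariance matrix and by $Q_N$ the quadratic form, the representation theorem for quadratic forms [29, p. 90] implies that

$$\frac{Q_N - E_{H_0}(Q_N)}{\sqrt{\operatorname{Var}_{H_0}(Q_N)}} \overset{\mathcal{D}}{=} \sum_{s=1}^{ad} \frac{\lambda_s}{\sqrt{\sum_{\ell=1}^{ad} \lambda_\ell^2}} \cdot \frac{C_s - 1}{\sqrt{2}}. \tag{10}$$

Here $\overset{\mathcal{D}}{=}$ denotes equality in distribution, $\lambda_s$ are the eigenvalues of $T V_N T$ in decreasing order, and $(C_s)_s$ is a sequence of independent $\chi^2_1$-distributed random variables. Note that the eigenvalues $\lambda_s$ also depend on the dimension $d$ and the sample sizes $n_i$. Transferring the results of [31] for the one-group design with $a = 1$ to our general setting, we obtain the subsequent asymptotic null distributions of the standardized quadratic form for all asymptotic settings (3)-(5).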
Theorem 3.1:
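Let $\beta_s = \lambda_s \Big/ \sqrt{\sum_{\ell=1}^{ad} \lambda_\ell^2}$ for $s = 1,\dots,ad$. Then the standardized quadratic form has, under the null hypothesis and one of the frameworks (3)-(5), asymptotically
- a) a standard normal distribution if $\beta_1 \to 0$,
- b) a standardized $\chi^2_1$-distribution, i.e. the distribution of $(\chi^2_1 - 1)/\sqrt{2}$, if $\beta_1 \to 1$,
- c) the same distribution as the random variable $\sum_{s=1}^{\infty} b_s (C_s - 1)/\sqrt{2}$, if $\beta_s \to b_s$ for all $s$, for a decreasing sequence $(b_s)_s$ in $\ell^2$ with $\sum_{s=1}^{\infty} b_s^2 = 1$.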
It is worth noting that the influence of the different asymptotic frameworks is hidden in the corresponding conditions on the sequence of standardized eigenvalues $\beta_s$, which depend on both the dimension and the sample sizes.
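The role of the standardized eigenvalues can be illustrated by simulation. The following sketch draws from the weighted sum $\sum_s \beta_s (C_s - 1)/\sqrt{2}$ with independent $\chi^2_1$ variables $C_s$ for two hypothetical eigenvalue configurations: one dominating eigenvalue and 100 equal eigenvalues. The function names, the seed, the number of draws and the threshold 1.6449 (approximately the 95% standard normal quantile) are choices made here for illustration only.

```python
import random
from math import sqrt


def standardized_weights(eigenvalues):
    """beta_s = lambda_s / sqrt(sum_l lambda_l^2)."""
    norm = sqrt(sum(lam * lam for lam in eigenvalues))
    return [lam / norm for lam in eigenvalues]


def draw_limit(betas, n_draws, rng):
    """Draw from sum_s beta_s (C_s - 1) / sqrt(2) with C_s iid chi^2_1."""
    draws = []
    for _ in range(n_draws):
        total = 0.0
        for b in betas:
            c = rng.gauss(0.0, 1.0) ** 2  # chi^2_1 as a squared standard normal
            total += b * (c - 1.0) / sqrt(2.0)
        draws.append(total)
    return draws


def tail_fraction(draws, threshold):
    """Empirical probability of exceeding the threshold."""
    return sum(x > threshold for x in draws) / len(draws)


rng = random.Random(1)
one_dominant = draw_limit(standardized_weights([1.0]), 5000, rng)
many_equal = draw_limit(standardized_weights([1.0] * 100), 5000, rng)

z95 = 1.6449  # approximate 95% quantile of N(0, 1)
print("one dominating eigenvalue:", tail_fraction(one_dominant, z95))
print("100 equal eigenvalues:   ", tail_fraction(many_equal, z95))
```

Both configurations have mean zero and unit variance, but the skewness differs, so exceedance probabilities of a fixed normal quantile differ as well. With one dominating eigenvalue the draws are bounded below by $-1/\sqrt{2}$, which a standard normal variable is not.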
Since these quantities are unknown in general we cannot apply the result directly. In particular, we are not even able to calculate the test statistic, let alone choose its correct limit distribution. To this end, we first introduce novel unbiased estimators of the unknown traces involved in (8)-(9) and discuss their mathematical properties. Plugging them into (8)-(9) leads to the calculation of adequately standardized test statistics. Finally, the choice of proper critical values is discussed in Section 4.
3.1 Symmetrized Trace Estimators
Here we derive unbiased and ratio-consistent estimators for the unknown traces given in (8)-(9). Since it is not obvious that the usual plug-in estimators based on empirical covariance matrices are useful in high-dimensional settings, we follow the approach of [8, 31] and directly estimate the traces. In contrast to the one-sample design studied therein, we face the problem of additional nuisance parameters, namely the mean vectors $\mu_i$. To avoid their estimation we adapt Tyler's symmetrization trick from $M$-estimators of scatter (see e.g. [12], [14] or [36]) to the present situation, see also [7]. In particular, we consider differences of observation pairs from the same group, which fulfill $E(X_{i,\ell_1} - X_{i,\ell_2}) = 0$ and $\operatorname{Cov}(X_{i,\ell_1} - X_{i,\ell_2}) = 2\Sigma_i$ for $\ell_1 \neq \ell_2$, and introduce the following novel estimators for
[TABLE]
Here and throughout the paper expressions of the kind mean that the indices are pairwise different. In this sense all estimators (11)-(14) are symmetrized U-statistics, where the kernel is given by a specific quadratic or bilinear form. Their properties are analyzed below.
Lemma 3.1:
For any and it holds that
* is an unbiased and ratio-consistent estimator for .* 2. 2.
* is an unbiased and ratio-consistent estimator for * 3. 3.
* and are unbiased and ratio-consistent estimators for and respectively.*
Remark 3.1:
*(a) Recall that an -valued estimator is ratio-consistent for a sequence of real parameters iff in probability as . Here the estimators and parameters may depend on and/or .
(b) Studying the proof of Lemma 3.1 given in the supplementary material in detail, we see that all estimators are even (dimension-)stable in the sense of [8], i.e. they fulfill and for sequences not depending on and . *
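The symmetrization idea can be sketched for the simplest case, the trace $\operatorname{tr}(P_d \Sigma)$ of a single group with the centring matrix $P_d$. The code below is only an illustration of the principle and not the estimators (11)-(14) themselves: it averages the quadratic form over all pairwise differences, which have mean zero and covariance $2\Sigma$, so no mean vector has to be estimated. The data, seed and function names are hypothetical.

```python
import random
from itertools import combinations


def centred_quadratic_form(x):
    """x^T P_d x with P_d = I_d - J_d / d, i.e. the sum of squares about the mean of x."""
    d = len(x)
    return sum(v * v for v in x) - sum(x) ** 2 / d


def symmetrized_trace(sample):
    """U-statistic: average of (X_l1 - X_l2)^T P_d (X_l1 - X_l2) / 2 over all pairs."""
    pairs = list(combinations(sample, 2))
    total = sum(
        centred_quadratic_form([u - v for u, v in zip(x, y)]) for x, y in pairs
    )
    return total / (2 * len(pairs))


def plugin_trace(sample):
    """tr(P_d S) with S the usual sample covariance matrix (divisor n - 1)."""
    n, d = len(sample), len(sample[0])
    mean = [sum(x[t] for x in sample) / n for t in range(d)]
    total = sum(
        centred_quadratic_form([x[t] - mean[t] for t in range(d)]) for x in sample
    )
    return total / (n - 1)


rng = random.Random(7)
n, d = 8, 50
# Hypothetical data: independent components with unit variance and mean 5.
sample = [[5.0 + rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
# The same data with a different (time-dependent) mean vector added.
shifted = [[v + 100.0 * (t + 1) for t, v in enumerate(x)] for x in sample]

print(symmetrized_trace(sample), symmetrized_trace(shifted), plugin_trace(sample))
print("true value tr(P_d Sigma) for identity covariance:", d - 1)
```

For this first-order trace the pairwise estimator coincides algebraically with the plug-in value $\operatorname{tr}(P_d S)$, and it is unaffected by the mean vector. The benefit of working with differences directly shows up for traces of higher powers, where products of quadratic and bilinear forms over pairwise different indices avoid the bias of plug-in estimators.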
It follows from Lemma 3.1 that
[TABLE]
is an unbiased estimator of . This motivates to study the standardized quadratic form
[TABLE]
for testing . Its asymptotic behaviour under is summarized below.
Theorem 3.2:
Under and one of the frameworks (3)-(5) the statistic has the same asymptotic limit distributions as , if the respective conditions (a)-(c) from Theorem 3.1 are fulfilled.
The result shows that it is not reasonable to approximate the unknown distribution of the test statistic with a fixed distribution to obtain a valid test procedure. For example, choosing $z_{1-\alpha}$, the $(1-\alpha)$-quantile of the standard normal distribution ($\alpha \in (0,1)$), as critical value would lead to a valid asymptotic level $\alpha$ test in case of $\beta_1 \to 0$. However, for $\beta_1 \to 1$ the rejection probability would converge to $P\big((\chi^2_1 - 1)/\sqrt{2} > z_{1-\alpha}\big)$, which may lead to an asymptotically liberal ($\alpha = 0.05$ or $\alpha = 0.01$) or conservative ($\alpha = 0.10$) test decision, see Table 1. Conversely, choosing $(\chi^2_{1;1-\alpha} - 1)/\sqrt{2}$ as critical value (where $\chi^2_{1;1-\alpha}$ denotes the $(1-\alpha)$-quantile of the $\chi^2_1$-distribution), the asymptotic level is $\alpha$ if $\beta_1 \to 1$, but for $\beta_1 \to 0$ the rejection probability converges to $1 - \Phi\big((\chi^2_{1;1-\alpha} - 1)/\sqrt{2}\big)$, where $\Phi$ denotes the cumulative distribution function of $N(0,1)$. Again we obtain an asymptotically liberal ($\alpha = 0.10$) or extremely conservative ($\alpha = 0.05$ or $\alpha = 0.01$) test decision, see Table 1.
Hence, an indicator (i.e. estimator) for whether $\beta_1 \to 0$, $\beta_1 \to 1$ or something in between holds would be desirable. Nevertheless, even if the tests with fixed critical values are asymptotically correct (the normal quantile in case of $\beta_1 \to 0$ or the standardized $\chi^2_1$ quantile in case of $\beta_1 \to 1$), their true type I error control may be poor for small sample sizes, see the simulations in Section 5.1.
Thus, in any case it seems more appropriate to approximate the test statistic by a sequence of standardized $\chi^2$-distributions, as already advocated in [31] for the case of $a = 1$. We will propose such approximations in the next section, where a check criterion for $\beta_1 \to 0$ or $\beta_1 \to 1$ is also presented.
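The asymptotic levels under a mismatched fixed critical value can be computed in closed form. The sketch below assumes the two limit laws from Theorem 3.1 (standard normal and standardized $\chi^2_1$) and evaluates the probability that each limit exceeds the critical value taken from the other. It relies only on the Python standard library; the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def chi2_1_upper_tail(x):
    """P(chi^2_1 > x), using chi^2_1 = Z^2 for a standard normal Z."""
    if x <= 0:
        return 1.0
    return 2.0 * (1.0 - STD_NORMAL.cdf(sqrt(x)))


def chi2_1_quantile(p):
    """p-quantile of chi^2_1: the squared (1 + p)/2-quantile of N(0, 1)."""
    return STD_NORMAL.inv_cdf((1.0 + p) / 2.0) ** 2


def level_normal_cv_when_chi2_limit(alpha):
    """P(K_1 > z_{1-alpha}) for K_1 = (chi^2_1 - 1) / sqrt(2)."""
    z = STD_NORMAL.inv_cdf(1.0 - alpha)
    return chi2_1_upper_tail(1.0 + sqrt(2.0) * z)


def level_chi2_cv_when_normal_limit(alpha):
    """P(Z > (chi^2_{1;1-alpha} - 1) / sqrt(2)) for a standard normal Z."""
    cv = (chi2_1_quantile(1.0 - alpha) - 1.0) / sqrt(2.0)
    return 1.0 - STD_NORMAL.cdf(cv)


for alpha in (0.10, 0.05, 0.01):
    print(
        alpha,
        round(level_normal_cv_when_chi2_limit(alpha), 5),
        round(level_chi2_cv_when_normal_limit(alpha), 5),
    )
```

The two computed columns correspond to the two mismatched cases reported in Table 1: the normal critical value under a standardized $\chi^2_1$ limit, and the standardized $\chi^2_1$ critical value under a normal limit.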
4 Better Approximations
To motivate the subsequent approximation, recall from (10) that the standardized quadratic form is of weighted $\chi^2$-form. Following [40] it is reasonable to approximate statistics of this form by a standardized $\chi^2_f$-distribution such that the first three moments coincide. Straightforward calculations show that this is achieved by approximating with
[TABLE]
In case of $a = 1$ this simplifies to the method presented in [31]. There it has already been seen that the approximation (15) performs much better for smaller sample sizes and/or dimensions than the above approaches with a fixed distribution. We will later rediscover this observation in Section 5 for our present design with general $a$. The next theorem gives a mathematical reason for this approximation.
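The moment matching behind this approximation can be written out explicitly. The following worked derivation is a sketch: it uses the weighted representation $W = \sum_s \beta_s (C_s - 1)/\sqrt{2}$ with independent $\chi^2_1$ variables $C_s$ and $\sum_s \beta_s^2 = 1$ from Theorem 3.1; the symbols $W$, $K_f$ and $f$ are notation assumed here for illustration.

```latex
\begin{align*}
K_f &= \frac{\chi^2_f - f}{\sqrt{2f}}: &
E(K_f) &= 0, & \operatorname{Var}(K_f) &= 1, &
E(K_f^3) &= \frac{8f}{(2f)^{3/2}} = \frac{2\sqrt{2}}{\sqrt{f}}, \\
W &= \sum_{s} \beta_s \frac{C_s - 1}{\sqrt{2}}: &
E(W) &= 0, & \operatorname{Var}(W) &= \sum_s \beta_s^2 = 1, &
E(W^3) &= 2\sqrt{2} \sum_s \beta_s^3 .
\end{align*}
% Matching the third moments determines the degrees of freedom:
\[
\frac{1}{\sqrt{f}} = \sum_s \beta_s^3
\quad\Longleftrightarrow\quad
f = \frac{\big(\sum_s \lambda_s^2\big)^3}{\big(\sum_s \lambda_s^3\big)^2}.
\]
```

Since the eigenvalues are nonnegative, $\sum_s \beta_s^3 \le \beta_1 \sum_s \beta_s^2 = \beta_1$, so $\beta_1 \to 0$ forces $f \to \infty$ and $K_f$ approaches a standard normal variable, whereas $\beta_1 \to 1$ gives $f \to 1$ and thus the standardized $\chi^2_1$ limit.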
Theorem 4.1:
Under the conditions of Lemma 3.1 and one of the frameworks (3)-(5) we have that given in (15) has, under , asymptotically
- a)
a standard normal distribution if as ,
- b)
a standardized distribution if as .
Thus, compared to the approximation with a fixed limit distribution, the -approach would at least be asymptotically correct whenever while always providing a three moment approximation to the test statistic. To apply this result, an estimator for in (15) is needed. Since we have already found as unbiased and ratio-consistent estimator for , it remains to find an adequate one for . A combination of both will then lead to a proper estimator for and , respectively. Again we prefer a direct estimation of the involved traces. To this end, we introduce normal random vectors
[TABLE]
with for all . Note that these vectors are multivariate normally distributed with and . Utilizing their particular form, it is shown in the supplement that a cyclic combination of these random vectors yields an unbiased estimator for . In particular, writing for we have
[TABLE]
This motivates the definition of (for )
[TABLE]
where
[TABLE]
[TABLE]
[TABLE]
Its properties together with a consistent estimator for are summarized below.
Lemma 4.1:
*(a) The estimator given in (17) is unbiased for .
(b) Suppose that is fixed. Then is a consistent estimator for as
, i.e. we have convergence in probability*
[TABLE]
(c) Now suppose that and that there exists some such that . Then (18) even holds under the asymptotic frameworks (4) - (5).
Theorem 4.2:
Suppose (18). Then, Theorem 4.1 remains valid if we replace by its estimator .
Remark 4.2:
*(a) Using similar arguments as in the proof of Lemma 8.1. of [31] we obtain the equivalences and . Thus, can also be used as check criterion for these two cases.
(b) It is also possible to derive a consistent estimator for , a key quantity in [11], see the supplement for details concerning the estimator. The corresponding approximation by the sequence even shares the same asymptotic properties of the Pearson approximation (15) stated in Theorem 4.1 and Theorem 4.2. However, it only provides a two moment approximation which turned out to perform worse in simulations (results not shown).
(c) In the supplement, we additionally present an unbiased estimator for such that is consistent for in all asymptotic frameworks (3) - (5). Particularly, the extra condition is not needed. However, it is computationally more expensive compared to and thus omitted here.*
In practical applications, the computation costs for are nevertheless rather high. This leads to disproportional waiting times for -values of the corresponding approximate test , where the critical value is given as -quantile of . Therefore, we propose a certain subsampling-type method. Since the unbiasedness of clearly stems from (16) it seems reasonable to proceed as follows: For each and we independently draw random subsamples of length from and store them in a joint random vector . Then, a subsampling-version of the estimator is given by
[TABLE]
Letting as it is easy to see (see the supplement for details), that has the same asymptotic properties as . In particular, it is stated in the supplement that is a consistent estimator for and that the approximation has the same weak limits as stated in Theorem 4.2. This leads to which is an asymptotically exact test whenever . The finite sample, dimension and group size performance of this approximation are investigated in the subsequent section.
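Given a value of the standardized test statistic and an estimate of the degrees of freedom, the p-value of the approximate test is an upper tail probability of the standardized $\chi^2_f$-distribution. The sketch below computes it with the standard library only, using the power series of the regularized incomplete gamma function so that non-integer $f$ is allowed. The function names are hypothetical, and the example inputs are taken from Table 2 under the assumption that its numeric columns are the test statistic, the estimated degrees of freedom and the p-value.

```python
from math import exp, lgamma, log, sqrt


def chi2_cdf(x, f):
    """P(chi^2_f <= x) via the series of the regularized lower incomplete
    gamma function P(f/2, x/2); intended for moderate arguments only."""
    if x <= 0:
        return 0.0
    a, y = f / 2.0, x / 2.0
    term, total, k = 1.0, 1.0, 0
    # Series: sum_k y^k / ((a + 1) (a + 2) ... (a + k)), all terms positive.
    while term > 1e-17 * total and k < 100000:
        k += 1
        term *= y / (a + k)
        total += term
    return min(1.0, exp(a * log(y) - y - lgamma(a + 1.0)) * total)


def standardized_chi2_pvalue(w, f):
    """P(K_f > w) for K_f = (chi^2_f - f) / sqrt(2 f)."""
    return 1.0 - chi2_cdf(f + sqrt(2.0 * f) * w, f)


# (test statistic, estimated degrees of freedom) as read from Table 2.
for w, f in [(-0.45671, 1.19030), (6.24114, 7.07832), (2.37921, 155.89025)]:
    print(w, f, round(standardized_chi2_pvalue(w, f), 5))
```

For large $f$ the p-value is close to the normal tail probability, while for small $f$ the skewness of the $\chi^2_f$-distribution matters; this is the way the approximation adapts between the two limiting cases.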
5 Simulations
In the previous sections we considered the asymptotic properties of the proposed inference methods which are valid for large sample and fixed or possibly large dimension and/or group sizes. Here we investigate the small sample properties of our proposed approximation procedure
in comparison to the statistical tests and based on fixed critical values. In particular, we compare these procedures in simulation studies with respect to
- (a)
their type I error rate control under the null hypothesis (Section 5.1) and
- (b)
their power behaviour under various alternatives (Section 5.2).
All simulations were performed with the help of the R computing environment (R Development Core Team, 2013), each with simulation runs.
5.1 Asymptotic distribution and Type I error control
First we study the speed of convergence, i.e. type I error control, of the three different tests under the null hypothesis. To be in line with the simulation results presented in [31] for the case we also multiplied the statistic by to avoid a slightly liberal behaviour.
Due to the abundance of different split-plot designs and the more methodological focus of the paper, we restrict our simulation study to two specific null hypotheses and a high dimensional and heteroscedastic two-sample setting. In particular, we investigate the type-I-error behaviour of all three tests for the null hypotheses
- •
and
- •
.
In both cases sample sizes were chosen from and combined with various choices of dimensions . For the covariance matrices a heteroscedastic setting with autoregressive structures and was chosen and for each simulation run subsamples were drawn.
Note that these settings imply for and for , see the supplement for details.
Thus, the proposed approximation is asymptotically exact in both cases while the two tests with fixed critical values possess the asymptotic behaviour given in Table 1. In particular, the test with the normal critical value should be rather liberal for testing the first hypothesis and the test with the $\chi^2$ critical value strongly conservative for the second one. All these theoretical findings can be recovered in our simulations: The results for the first hypothesis, displayed in Figure 1, show an inflated type I error of the test with the normal critical value for smaller sample sizes. For larger sample sizes it stabilizes in the region of its asymptotic level. Moreover, the error control is only slightly affected by the varying dimensions under investigation. In comparison, the two asymptotically correct tests are slightly liberal for smaller sample sizes and more or less asymptotically correct for moderate to larger sample sizes. Here, it is astonishing that both procedures nearly coincide, suggesting a fast convergence of the degrees of freedom estimator.
The results for the second hypothesis, presented in Figure 2, are slightly different. In particular, both tests based on fixed critical values are more affected by the underlying dimension: for smaller $d$ the true level is considerably larger than the asymptotic level given in Table 1, resulting in a rather liberal behaviour of the test with the normal critical value and close to exact type I error control for the test with the $\chi^2$ critical value. This effect decreases with increasing sample sizes. Moreover, for larger dimensions both tests approach their asymptotic level. In comparison, the procedure based on the proposed approximation shows a fairly good level control across all dimension and sample size settings, making it the method of choice.
5.2 Power Performance
We examined the power of the three procedures. Again a heteroscedastic two group split-plot design with autoregressive covariance structures ( and ) was selected. The alpha level () and the null hypotheses were chosen as above ( and ). The investigated alternatives were
- •
a trend alternative for both hypotheses with and for and additionally
- •
a shift alternative for with and and
- •
a one-point alternative for , with and ,
each with increased . We only considered the moderate sample size setting with and together with three choices of dimensions . The results can be found in Figures 3 and 4.
It can be readily seen that the power depends on the type of alternative: For the trend (Figure 3) and the shift alternative (left panel of Figure 4) the power gets larger with increasing dimension. This is especially apparent for the shift alternative, where the power increases considerably from to . In contrast, for the one-point alternative the power becomes smaller for higher dimensions (right panel of Figure 4). However, this is as expected since a difference in one single component can be detected more easily for smaller $d$.
6 Analysis of a sleep laboratory data set
Finally, the new methods are exemplified on the sleep laboratory trial reported in [22]. In this two-armed repeated measures trial, the activity of prostaglandin-D-synthase ($\beta$-trace) was measured every 4 hours over a period of 4 days. The grouping factor was gender and the above repeated measures were observed on young healthy men (group ) and women (group ). Since each day presented a certain sleep condition, the repeated measures are structured by two crossed fixed factors:
- •
intervention (with levels: normal sleep, sleep deprivation, recovery sleep and REM sleep deprivation) and
- •
time (with the levels/time points and ).
Due to we are thus dealing with a high-dimensional split-plot design with groups and repeated measures. The time profiles of each subject are displayed in Figure 5 (for the female group ) and Figure 6 (for the male group ). We note, that group-specific profile analysis could already be performed by the methods given in [31]. In particular, they found a significant intervention and a borderline time effect for the male group. For the current two-sample design additional questions concern (1) whether there is a gender effect, i.e. the time profiles of the groups differ, and if so (2) whether they differ with respect to certain interventions.
Moreover, investigations regarding (3) a general effect of time and (4) interactions between the different factors are of equal interest. Utilizing the notation from Section 2, the corresponding null hypotheses can be formalized via adequate contrast matrices. In particular, we are interested in testing the null hypotheses
- (a)
No gender effect:
- (b)
No time effect: ,
- (c)
No interaction effect between time and group: ,
- (d)
No time effect for intervention , :
,
- (e)
No effect between interventions and , :
,
where denotes the Kronecker delta. Applying the test based on the standardized quadratic form as test statistic and the proposed -approximation with subsamples we obtain the results summarized in Table 2 .
There it can be readily seen that most hypotheses cannot be rejected at level . In particular, there is no evidence for an overall gender effect, so that we have not performed post-hoc analyses on the interventions. Only a highly significant time effect, as well as a significant effect between the first two interventions (normal sleep and sleep deprivation), could be detected. However, applying a multiplicity adjustment (Bonferroni or Holm) only the time effect remained significant.
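The multiplicity adjustment can be retraced with a few lines of code. The sketch below applies the Bonferroni and Holm procedures to the thirteen p-values of Table 2, listed in table order. The family-wise level 0.05 is an assumption made here, and the function names are hypothetical.

```python
def holm_rejections(p_values, alpha=0.05):
    """Indices rejected by Holm's step-down procedure at family-wise level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = []
    for rank, i in enumerate(order):
        # Compare the (rank + 1)-th smallest p-value with alpha / (m - rank).
        if p_values[i] <= alpha / (m - rank):
            rejected.append(i)
        else:
            break  # Step-down: stop at the first non-rejection.
    return rejected


def bonferroni_rejections(p_values, alpha=0.05):
    """Indices with p-value at most alpha / m."""
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / m]


# p-values from Table 2 in table order (index 1: time effect,
# index 7: effect between the first two interventions).
TABLE2_P = [0.55832, 0.00008, 0.20120, 0.784463, 0.71764, 0.65845, 0.88385,
            0.01285, 0.39240, 0.68099, 0.75968, 0.70183, 0.74046]

unadjusted = [i for i, p in enumerate(TABLE2_P) if p <= 0.05]
print("unadjusted:", unadjusted)
print("Bonferroni:", bonferroni_rejections(TABLE2_P))
print("Holm:      ", holm_rejections(TABLE2_P))
```

Holm's procedure is never less powerful than Bonferroni's, but here both lead to the same decision because the second smallest p-value already exceeds its Holm threshold of $0.05/12$.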
7 Conclusion & Outlook
In this paper we have investigated inference procedures for general split-plot models, allowing for unbalanced and/or heteroscedastic covariance settings as well as a factorial structure on the whole- and sub-plot factors. Inspired by the work of [31] for one-group repeated measures designs, the test statistics were based on standardized quadratic forms. However, in contrast to their work, novel symmetrized $U$-statistics were introduced to adequately handle the problem of additional nuisance parameters in the multiple sample case.
To jointly cover low- and high-dimensional models as well as situations with a small or large number of groups, we conducted an in-depth study of their asymptotic behaviour under a unified asymptotic framework. In particular, the number of groups and dimensions may be fixed as in classical asymptotic settings, or even converge to infinity. Here we postulate neither assumptions on how $a$ and/or $d$ and the underlying sample sizes converge to infinity nor any sparsity conditions on the covariance structures, since such assumptions are usually hard to check for a practical data set at hand. As a consequence, it turned out that the test statistic possesses a whole continuum of asymptotic limits that depend on the eigenvalues of the underlying covariances. We thus argued that an approximation by a fixed critical value is not adequate and proposed an approximation by a sequence of standardized $\chi^2_f$-distributions with estimated degrees of freedom. For computational efficiency we additionally provided a subsampling-type version of the degrees of freedom estimator. Our approach provides a reasonably good three moment approximation of the test statistic and is even asymptotically exact if the influence of the largest eigenvalue is negligible (leading to a standard normal limit) or decisive (leading to a standardized $\chi^2_1$ limit).
Apart from these asymptotic considerations we evaluated the finite sample and dimension performance of our approximation technique. In particular, for varying combinations of sample sizes and dimensions, we compared its power and type I error control with test procedures based on fixed critical values. In all designs it showed a quite accurate error control from low- to high-dimensional situations. In comparison, its performance was considerably better than that of the other two tests, which partially disclosed a rather liberal or conservative behaviour.
In future research we would like to extend the current results to general high-dimensional MANOVA designs, where we would also like to relax the involved assumption of multivariate normality and/or even test simultaneously for mean and covariance effects as recently proposed in [28]. These investigations, however, require completely different (e.g., martingale) techniques and estimators of the involved traces. Moreover, we also plan to conduct more detailed simulations (especially for larger group sizes and other covariance matrices) in a more applied paper.
Acknowledgement
The authors would like to thank Edgar Brunner for helpful discussions. This work was supported by the German Research Foundation project DFG-PA 2409/4-1.
Supplementary Material to
**'Inference For High-Dimensional Split-Plot-Designs: A Unified Approach for Small to Large Numbers of Factor Levels'**
Paavo Sattler1 and Markus Pauly1
1University of Ulm, Institute of Statistics
Abstract. In this supplement we present all theoretical derivations and computations that were omitted in the paper for lucidity.
Appendix A
We start with some preliminary results and lemmata.
A.1 Basics
In Section 2 of the main paper we claimed that the unique projection matrix to the hypothesis matrix that equivalently describes the null is given by the product of two projection matrices . We start with the proof of this claim:
Lemma A.1:
Let be with . For each hypothesis with such a matrix there exist projectors which can be used to formulate the same null hypothesis with .
- Proof:
It is known that the projector fulfills . For this reason, and utilizing well-known rules for generalized inverses (see for example [33]), we obtain
[TABLE]
Thus, and are projectors, i.e. idempotent and symmetric. ∎
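The role of such a projector can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes the usual construction $\boldsymbol{T}=\boldsymbol{H}^{\top}(\boldsymbol{H}\boldsymbol{H}^{\top})^{-}\boldsymbol{H}$, uses the Moore-Penrose inverse as one admissible generalized inverse, and the function name `hypothesis_projector` as well as the example contrast matrix are hypothetical.

```python
import numpy as np

def hypothesis_projector(H):
    """Return T = H^T (H H^T)^+ H, the orthogonal projector onto the row space of H.

    The Moore-Penrose inverse is used as one particular generalized inverse
    (an assumption of this sketch).
    """
    H = np.atleast_2d(np.asarray(H, dtype=float))
    return H.T @ np.linalg.pinv(H @ H.T) @ H

# Hypothetical example: a redundant contrast matrix comparing three means.
H = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0],
              [1.0, 0.0, -1.0]])
T = hypothesis_projector(H)

mu_null = np.array([2.0, 2.0, 2.0])  # satisfies H mu = 0
mu_alt = np.array([1.0, 2.0, 4.0])   # violates H mu = 0

# T is symmetric and idempotent, and T mu = 0 exactly when H mu = 0.
print(np.allclose(T, T.T), np.allclose(T @ T, T))
print(np.allclose(T @ mu_null, 0.0), np.allclose(T @ mu_alt, 0.0))
```

In this example the projector coincides with the centering matrix $\boldsymbol{I}_{3}-\frac{1}{3}\boldsymbol{1}\boldsymbol{1}^{\top}$, so the redundant row of the contrast matrix has no effect on the hypothesis being described.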
To prove our main results we have to compare various traces of powers of combinations of the underlying covariance matrices. To this end, we will particularly apply the following inequalities:
Lemma A.2:
For positive real numbers a,b and a symmetric matrix it holds
[TABLE]
For symmetric with eigenvalues it holds that
[TABLE]
If is positive definite and symmetric and is idempotent and symmetric it holds for every that
[TABLE]
- Proof:
The first part is an application of the Cauchy–Bunyakovsky–Schwarz inequality, with the Frobenius inner product. Therefore
[TABLE]
The second part just uses the binomial theorem together with the condition for :
[TABLE]
Finally, the last inequality follows from the second one, if we show that all conditions are fulfilled. With idempotence of and invariance of the trace under cyclic permutations, it follows for all that
[TABLE]
Thus, it is sufficient to consider this term. Since is symmetric all powers are symmetric too and it follows with that
[TABLE]
since and are positive definite and . So both conditions of the second inequality are shown and
[TABLE]
∎
Furthermore, an inequality for traces which contain and is needed.
Lemma A.3:
Let be positive definite and symmetric matrices and suppose that is idempotent and symmetric. Then it holds for that
[TABLE]
- Proof:
As shown before, and are symmetric and positive semidefinite. For this reason, a symmetric matrix exists with . Due to the fact that all matrices are symmetric, it holds
[TABLE]
and because is positive semidefinite also
[TABLE]
This allows us to use the inequalities from above for this matrix. Again utilizing the invariance of the trace under cyclic permutations, we obtain
\begin{array}[]{ll}\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{\Sigma}_{r}\right)^{2}\right)&=\operatorname{tr}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{T}\boldsymbol{\Sigma}_{r}\boldsymbol{T}\cdot\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{T}\boldsymbol{\Sigma}_{r}\boldsymbol{T}\right)=\operatorname{tr}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{W}\boldsymbol{W}\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{W}\boldsymbol{W}\right)\\ &=\operatorname{tr}\left(\boldsymbol{W}\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{W}\boldsymbol{W}\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{W}\right)=\operatorname{tr}\left(\left(\boldsymbol{W}\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{W}\right)^{2}\right)\\ &\leq\operatorname{tr}^{2}\left(\boldsymbol{W}\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{W}\right)=\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{W}\boldsymbol{W}\right)=\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{T}\boldsymbol{\Sigma}_{r}\boldsymbol{T}\right)\\ &=\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{\Sigma}_{r}\right).\end{array}
∎
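The final inequality of this proof, $\operatorname{tr}((\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{\Sigma}_{r})^{2})\leq\operatorname{tr}^{2}(\boldsymbol{T}\boldsymbol{\Sigma}_{i}\boldsymbol{T}\boldsymbol{\Sigma}_{r})$, can be spot-checked numerically. The sketch below is only an illustration under assumed inputs: the centering matrix stands in for the idempotent symmetric matrix, and the covariance matrices are randomly generated positive definite matrices (the helper `random_spd` is hypothetical).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6

def random_spd(rng, d):
    """Random symmetric positive definite matrix (illustrative choice)."""
    A = rng.standard_normal((d, d))
    return A @ A.T + 0.1 * np.eye(d)

# Assumed idempotent, symmetric matrix: the centering matrix.
T = np.eye(d) - np.ones((d, d)) / d

def trace_gap(T, S_i, S_r):
    """tr^2(T S_i T S_r) - tr((T S_i T S_r)^2); non-negative by the lemma."""
    M = T @ S_i @ T @ S_r
    return np.trace(M) ** 2 - np.trace(M @ M)

gaps = [trace_gap(T, random_spd(rng, d), random_spd(rng, d)) for _ in range(200)]
min_gap = min(gaps)
print(min_gap >= 0.0)
```

A numerical check of this kind does not replace the proof, but it is a cheap safeguard against sign or index slips when the inequality is used in code.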
To standardize the quadratic form we also have to calculate its moments. Here, the following theorem helps:
Theorem A.4:
Let be a symmetric matrix and where is positive definite. Then with it holds,
[TABLE]
with for and .
- Proof:
The proof can be found on page 53 in [29]. ∎
Corollary A.5:
Let be a symmetric matrix and and independent, where are positive definite. Then we have for all that
\begin{array}[]{ll}{\mathbb{E}}\left(\left({\boldsymbol{X}}^{\top}\boldsymbol{T}{\boldsymbol{X}}\right)^{1}\right)=\operatorname{tr}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\right),\\[5.59721pt] {\mathbb{E}}\left(\left({\boldsymbol{X}}^{\top}\boldsymbol{T}{\boldsymbol{X}}\right)^{2}\right)=2\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\right)^{2}\right)+\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\right)\stackrel{{\scriptstyle\ref{Spur1}}}{{=}}\mathcal{O}\left(\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\right)\right),\\[7.74998pt] \operatorname{{\it Var}}\left({\boldsymbol{X}}^{\top}\boldsymbol{T}{\boldsymbol{X}}\right)=\mathcal{O}\left(\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\right)\right),\end{array}
\begin{array}[]{l}{\mathbb{E}}\left(\left({\boldsymbol{X}}^{\top}\boldsymbol{T}{\boldsymbol{Y}}\right)^{1}\right)=0,\\[4.30554pt] {\mathbb{E}}\left(\left({\boldsymbol{X}}^{\top}\boldsymbol{T}{\boldsymbol{Y}}\right)^{2}\right)=\operatorname{tr}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\boldsymbol{T}\boldsymbol{\Sigma}_{Y}\right),\\[4.30554pt] {\mathbb{E}}\left(\left({\boldsymbol{X}}^{\top}\boldsymbol{T}{\boldsymbol{Y}}\right)^{3}\right)=0,\\[4.30554pt] {\mathbb{E}}\left(\left({\boldsymbol{X}}^{\top}\boldsymbol{T}{\boldsymbol{Y}}\right)^{4}\right)=6\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\boldsymbol{T}\boldsymbol{\Sigma}_{Y}\right)^{2}\right)+3\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\boldsymbol{T}\boldsymbol{\Sigma}_{Y}\right),\end{array}
\begin{array}[]{l}\operatorname{{\it Var}}\left({\boldsymbol{X}}^{\top}\boldsymbol{T}{\boldsymbol{Y}}\right)=\operatorname{tr}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\boldsymbol{T}\boldsymbol{\Sigma}_{Y}\right),\\[4.30554pt] \operatorname{{\it Var}}\left(\left({\boldsymbol{X}}^{\top}\boldsymbol{T}{\boldsymbol{Y}}\right)^{2}\right)=6\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\boldsymbol{T}\boldsymbol{\Sigma}_{Y}\right)^{2}\right)+2\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\boldsymbol{T}\boldsymbol{\Sigma}_{Y}\right),\\[8.61108pt] \frac{4N}{n_{i}^{2}n_{r}^{2}}\operatorname{{\it Var}}\left(\left({\boldsymbol{X}}^{\top}\boldsymbol{T}{\boldsymbol{Y}}\right)^{2}\right)\stackrel{{\scriptstyle\ref{Spur2}}}{{=}}\mathcal{O}\left(\operatorname{tr}^{2}\left(\left(\frac{N}{n_{i}}\boldsymbol{T}\boldsymbol{\Sigma}_{X}\cdot\frac{N}{n_{r}}\boldsymbol{T}\boldsymbol{\Sigma}_{Y}\right)^{2}\right)\right).\end{array}
Moreover, for
\begin{array}[]{l}\operatorname{{\it Var}}\left({\boldsymbol{X}}^{\top}\boldsymbol{T}{\boldsymbol{Y}}\right)=\operatorname{tr}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\boldsymbol{T}\boldsymbol{\Sigma}_{X}\right)=\mathcal{O}\left(\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\boldsymbol{T}\boldsymbol{\Sigma}_{X}\right)\right),\\[8.61108pt] \operatorname{{\it Var}}\left(\left({\boldsymbol{X}}^{\top}\boldsymbol{T}{\boldsymbol{Y}}\right)^{2}\right)\stackrel{{\scriptstyle\ref{Spur1}}}{{=}}\mathcal{O}\left(\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{\Sigma}_{X}\boldsymbol{T}\boldsymbol{\Sigma}_{X}\right)\right).\end{array}
- Proof:
Using the inequalities for traces and with the bilinear form written as
[TABLE]
all equations follow from the previous theorem. ∎
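The first moment identities of the corollary can be checked by simulation. The following is a rough Monte Carlo sketch, assuming centered normal vectors and arbitrarily chosen matrices; the seed, the dimension and the number of replications are illustrative choices and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
d, reps = 4, 200_000

# Illustrative choices (assumptions): a centering matrix as T and two
# arbitrary positive definite covariance matrices.
T = np.eye(d) - np.ones((d, d)) / d
A = rng.standard_normal((d, d))
B = rng.standard_normal((d, d))
Sigma_X = A @ A.T / d + np.eye(d)
Sigma_Y = B @ B.T / d + np.eye(d)

# Independent centered normal samples, one row per replication.
X = rng.multivariate_normal(np.zeros(d), Sigma_X, size=reps)
Y = rng.multivariate_normal(np.zeros(d), Sigma_Y, size=reps)

quad = np.einsum("ni,ij,nj->n", X, T, X)  # X^T T X for each draw
bil = np.einsum("ni,ij,nj->n", X, T, Y)   # X^T T Y for each draw

TSx = T @ Sigma_X
M = T @ Sigma_X @ T @ Sigma_Y
checks = {
    "E[X'TX]": (quad.mean(), np.trace(TSx)),
    "Var[X'TX]": (quad.var(), 2 * np.trace(TSx @ TSx)),
    "E[(X'TY)^2]": (np.mean(bil ** 2), np.trace(M)),
}
for name, (simulated, exact) in checks.items():
    print(f"{name}: simulated {simulated:.3f}, exact {exact:.3f}")
```

The simulated and exact values should agree up to Monte Carlo error. The fourth moment of the bilinear form is deliberately left out here because its simulation error is considerably larger.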
Lemma A.6:
Let be a real random variable with , a sequence with , and a sequence with then it holds
- •
**
- •
**
For they are especially ratio-consistent.
- Proof:
For arbitrary , Chebyshev's inequality leads to
[TABLE]
Considering the limit for justifies the consistency, and using this for leads to ratio-consistency. The second part follows analogously. ∎
This result is especially true if or only depends on n resp. .
For completeness we state a straightforward application of the Cauchy–Bunyakovsky–Schwarz inequality:
Lemma A.7:
For real random variables it holds
[TABLE]
and so for identically distributed
[TABLE]
The next result gives equivalent conditions for :
Lemma A.8:
Let be again the eigenvalues of sorted so that is the biggest one. Then it follows
[TABLE]
[TABLE]
Moreover, we know . This lemma also holds if is replaced by or .
- Proof:
This follows from Lemma 8.1 given in the supplement of [31, page 21], since their result does not depend on the concrete matrix, i.e. it can be directly applied for . Moreover, the different asymptotic frameworks do not influence the proof since they are hidden within the above convergences. ∎
To prove the properties of the subsampling-type estimators some auxiliaries are needed. In particular, the following lemma allows us to decompose the variances and to use conditional terms for the calculation.
Lemma A.9:
Let be a real random variable and denote by a -field. Then it holds that
[TABLE]
- Proof:
With the rules for conditional expectations we calculate
[TABLE]
The result follows by summing both parts. ∎
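Assuming the lemma is the usual variance decomposition $\operatorname{Var}(X)=\mathbb{E}(\operatorname{Var}(X\mid\mathcal{F}))+\operatorname{Var}(\mathbb{E}(X\mid\mathcal{F}))$, it can be verified exactly on a small discrete example. The joint distribution below is made up purely for illustration; the conditioning $\sigma$-field is generated by a hypothetical group label $G$.

```python
import numpy as np

# Hypothetical joint pmf of (G, X): rows index the values of G, which generate
# the conditioning sigma-field; columns index the values of X.
x_vals = np.array([-1.0, 0.0, 2.0, 5.0])
pmf = np.array([[0.10, 0.05, 0.05, 0.00],
                [0.05, 0.20, 0.10, 0.05],
                [0.00, 0.10, 0.10, 0.20]])

p_g = pmf.sum(axis=1)                # marginal distribution of G
cond = pmf / p_g[:, None]            # conditional pmf of X given G
cond_mean = cond @ x_vals            # E(X | G = g)
cond_var = cond @ x_vals**2 - cond_mean**2   # Var(X | G = g)

p_x = pmf.sum(axis=0)                # marginal distribution of X
total_var = p_x @ x_vals**2 - (p_x @ x_vals) ** 2

expected_cond_var = p_g @ cond_var                          # E(Var(X | G))
var_cond_mean = p_g @ cond_mean**2 - (p_g @ cond_mean) ** 2  # Var(E(X | G))

print(np.isclose(total_var, expected_cond_var + var_cond_mean))
```

In the subsampling setting the conditioning would be on the drawn index vectors rather than on a group label; the arithmetic of the decomposition is the same.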
We will apply the result for certain numbers of pairs below. There, for each and we independently draw random subsamples of length from and store them in a joint random vector . In addition, we define .
Lemma A.10:
Let be the number of pairs , which fulfill and have totally different elements, and analogously . As long as for all , it holds that
[TABLE]
and for
[TABLE]
where denotes the number of elements.
Let be the number of pairs fulfilling and such that, moreover, and have totally different elements. If it holds
[TABLE]
- Proof:
Because never contains pairs of the kind $(k,k)$, the maximal number of elements is . The fact that two vectors have no element in common, even at different components, is denoted as .
The number of totally different pairs can be seen as binomially distributed with elements, and independence is used to calculate the necessary probability. Since all combinations in this situation have the same probability, it follows that
[TABLE]
If elements are picked twice from , there are possibilities, and in of them both -tuples are totally different. This leads to the stated probability, and with the mean of the binomial distribution we get
[TABLE]
All in all we calculate
[TABLE]
For and fewer multiplications are needed, so the results follow. ∎
If (for example B could be chosen proportional to N) these terms converge to zero, regardless of the number of groups or of m.
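The counting step of the proof can be made concrete by brute force. The sketch below assumes that a subsample is an unordered set of $m$ distinct indices out of $n$ and verifies the elementary fact that, among all $\binom{n}{m}^{2}$ pairs of such subsamples, exactly $\binom{n}{m}\binom{n-m}{m}$ are totally different. The function name and the values of `n` and `m` are illustrative assumptions, and the exact expressions of the lemma may be organised differently.

```python
from itertools import combinations
from math import comb

def count_disjoint_pairs(n, m):
    """Count ordered pairs of m-subsets of {0, ..., n-1} sharing no element."""
    subsets = [frozenset(c) for c in combinations(range(n), m)]
    return sum(1 for s in subsets for t in subsets if s.isdisjoint(t))

n, m = 7, 2
total = comb(n, m) ** 2          # all ordered pairs of subsamples
disjoint = count_disjoint_pairs(n, m)
prob = disjoint / total          # probability of two totally different subsamples

print(disjoint == comb(n, m) * comb(n - m, m))
print(prob)
```

The resulting probability equals $\binom{n-m}{m}/\binom{n}{m}$, which tends to one for fixed $m$ and growing $n$. This is the reason why overlapping subsamples become negligible for large sample sizes.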
A.2 Proofs of Section 3
- Proof of Theorem 3.1:
The proof of this theorem is very similar to that of [31, Theorem 2.1]. Due to the fact that a finite sum of multivariate normally distributed random variables is again multivariate normally distributed, the representation theorem can be used to (distributionally equivalently) express the quadratic form as .
The only differences to [31, Theorem 2.1] are that, in the case of more groups, the eigenvalues do not only depend on but also on the and , and that there are more terms to sum. The first point only influences the limit of the . The higher number of summands does not matter because we consider the asymptotics under the frameworks (4)-(5), for which at least or converges to infinity. The proof of [31, Theorem 2.1] only needs the representation from above, a number of summands which goes to infinity, and the conditions on the limits of the . Since these are fulfilled, the proof can be conducted in the same way. ∎
- Proof of 3.1:
Recall that with and , the trace estimators were defined by
\begin{array}[]{l}A_{i,1}\hskip 5.69046pt=\frac{1}{2\cdot\binom{n_{i}}{2}}\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1},\ell_{2}=1\\ \ell_{1}>\ell_{2}\end{subarray}}^{n_{i}}\left({\boldsymbol{X}}_{i,\ell_{1}}-{\boldsymbol{X}}_{i,\ell_{2}}\right)^{\top}\boldsymbol{T}_{S}\left({\boldsymbol{X}}_{i,\ell_{1}}-{\boldsymbol{X}}_{i,\ell_{2}}\right),\end{array}\\ \\ \begin{array}[]{l}A_{i,r,2}=\frac{1}{4\cdot\binom{n_{i}}{2}\binom{n_{r}}{2}}{\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1},\ell_{2}=1\\ \ell_{1}>\ell_{2}\end{subarray}}^{n_{i}}\sum\limits_{\footnotesize\begin{subarray}{c}k_{1},k_{2}=1\\ k_{1}>k_{2}\end{subarray}}^{n_{r}}\left[\left({\boldsymbol{X}}_{i,\ell_{1}}-{\boldsymbol{X}}_{i,\ell_{2}}\right)^{\top}\boldsymbol{T}_{S}\left({\boldsymbol{X}}_{r,k_{1}}-{\boldsymbol{X}}_{r,k_{2}}\right)\right]^{2}},\end{array}\\ \\ \begin{array}[]{l}A_{i,3}\hskip 5.69046pt=\frac{1}{4\cdot 6\binom{n_{i}}{4}}\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1},\ell_{2}=1\\ \ell_{1}>\ell_{2}\end{subarray}}^{n_{i}}\sum\limits_{\footnotesize\begin{subarray}{c}k_{2}=1\\ k_{2}\neq\ell_{1}\neq\ell_{2}\end{subarray}}^{n_{i}}\sum\limits_{\footnotesize\begin{subarray}{c}k_{1}=1\\ \ell_{2}\neq\ell_{1}\neq k_{1}>k_{2}\end{subarray}}^{n_{i}}\left[\left({\boldsymbol{X}}_{i,\ell_{1}}-{\boldsymbol{X}}_{i,\ell_{2}}\right)^{\top}\boldsymbol{T}_{S}\left({\boldsymbol{X}}_{i,k_{1}}-{\boldsymbol{X}}_{i,k_{2}}\right)\right]^{2},\end{array}\\ \\ \begin{array}[]{l}A_{4}\hskip 11.38092pt=\sum_{i=1}^{a}\left(\frac{N}{n_{i}}\right)^{2}{(\boldsymbol{T}_{W})_{ii}}^{2}A_{i,3}+2\sum_{i=1}^{a}\sum_{r=1,r<i}^{a}\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}A_{i,r,2}.\end{array}*
For we know and for totally different indices the are statistically independent. So the previous lemmata can be used to calculate the moments. Unbiasedness is shown by calculating the expectation of each estimator:
[TABLE]
The following argument is used several times in this work with minor variations, so we present it here in more detail.
To bound the variance, we first note that is 0 if all indices are totally different, so only combinations remain. Instead of calculating the covariances of the remaining quadratic forms, it is easier to use the lemmata from above. Since all quadratic forms are identically distributed, their variances coincide, so the bound is just the number of remaining combinations multiplied by this common variance. This leads to:
\begin{array}[]{ll}\operatorname{{\it Var}}\left(A_{i,1}\right)&=\frac{1}{4\cdot\binom{n_{i}}{2}^{2}}\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1},\ell_{2}=1\\ \ell_{1}>\ell_{2}\end{subarray}}^{n_{i}}\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1}^{\prime},\ell_{2}^{\prime}=1\\ \ell_{1}^{\prime}>\ell_{2}^{\prime}\end{subarray}}^{n_{i}}\operatorname{{\it Cov}}\left[{{\boldsymbol{Y}}_{i,\ell_{1},\ell_{2}}}^{\top}\boldsymbol{T}_{S}{\boldsymbol{Y}}_{i,\ell_{1},\ell_{2}}\hskip 1.42271pt\large{\textbf{;}}\normalsize\hskip 1.42271pt{{\boldsymbol{Y}}_{i,\ell_{1}^{\prime},\ell_{2}^{\prime}}}^{\top}\boldsymbol{T}_{S}{\boldsymbol{Y}}_{i,\ell_{1}^{\prime},\ell_{2}^{\prime}}\right]\\ &\stackrel{{\scriptstyle\ref{Var1}}}{{\leq}}\frac{\binom{n_{i}}{2}-\binom{n_{i}-2}{2}}{4\binom{n_{i}}{2}}\operatorname{{\it Var}}\left[{{\boldsymbol{Y}}_{i,1,2}}^{\top}\boldsymbol{T}_{S}{\boldsymbol{Y}}_{i,1,2}\right]+\frac{\binom{n_{i}-2}{2}}{4\binom{n_{i}}{2}}\cdot 0\\[6.45831pt] &\stackrel{{\scriptstyle\ref{QF4}}}{{=}}\frac{\binom{n_{i}}{2}-\binom{n_{i}-2}{2}}{4\binom{n_{i}}{2}}{\mathcal{O}\left(\operatorname{tr}^{2}\left(2\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)\right)}\\[8.61108pt] &=\mathcal{O}\left(n_{i}^{-1}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{2}\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)\right).\end{array}
With these values we know for that
\begin{array}[]{ll}{\mathbb{E}}\left(\sum\limits_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}A_{i,1}\right)=\sum\limits_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}{\mathbb{E}}\left(A_{i,1}\right)=\operatorname{tr}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)\end{array}
and
\begin{array}[]{ll}\operatorname{{\it Var}}\left(\frac{\sum\limits_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}A_{i,1}}{{\mathbb{E}}\left(\sum\limits_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}A_{i,1}\right)}\right)&=\frac{\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}\operatorname{{\it Var}}(A_{i,1})}{\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)}\\[6.45831pt] &\leq\frac{\sum\limits_{i=1}^{a}\mathcal{O}\left(n_{i}^{-1}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{2}\left(\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)\right)}{\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)}\\[6.45831pt] &\leq\frac{\mathcal{O}\left(\frac{1}{n_{\min}}\right)\cdot\mathcal{O}\left(\sum\limits_{i=1}^{a}\operatorname{tr}^{2}\left(\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)\right)}{\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)}\\[6.45831pt] &\leq\frac{\mathcal{O}\left(\frac{1}{n_{\min}}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{2}\left(\sum\limits_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)\right)}{\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)}=\mathcal{O}\left(\frac{1}{n_{\min}}\right).\end{array}
So the conditions for an unbiased and ratio-consistent estimator are fulfilled.
The same steps with a different number of remaining combinations lead to
\begin{array}[]{ll}{\mathbb{E}}\left(A_{i,3}\right)&={\frac{1}{4\cdot 6\binom{n_{i}}{4}}\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1},\ell_{2}=1\\ \ell_{1}>\ell_{2}\end{subarray}}^{n_{i}}\sum\limits_{\footnotesize\begin{subarray}{c}k_{1},k_{2}=1\\ \ell_{2}\neq\ell_{1}\neq k_{1}>k_{2}\neq\ell_{1}\neq\ell_{2}\end{subarray}}^{n_{i}}{\mathbb{E}}\left(\left[{{\boldsymbol{Y}}_{i,\ell_{1},\ell_{2}}}^{\top}\boldsymbol{T}_{S}{\boldsymbol{Y}}_{i,k_{1},k_{2}}\right]^{2}\right)}\\ &\stackrel{{\scriptstyle\ref{QF4}}}{{=}}\frac{1}{4\cdot 6\binom{n_{i}}{4}}\cdot{6\binom{n_{i}}{4}}\cdot\operatorname{tr}\left(4\cdot\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)^{2}\right)=\operatorname{tr}\left(\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)^{2}\right),\end{array}
\begin{array}[]{ll}\operatorname{{\it Var}}\left({A_{i,3}}\right)&=\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1},\ell_{2}=1\\ \ell_{1}>\ell_{2}\end{subarray}}^{n_{i}}\sum\limits_{\footnotesize\begin{subarray}{c}k_{1},k_{2}=1\\ \ell_{2}\neq\ell_{1}\neq k_{1}>k_{2}\neq\ell_{1}\neq\ell_{2}\end{subarray}}^{n_{i}}\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1}^{\prime},\ell_{2}^{\prime}=1\\ \ell_{1}^{\prime}>\ell_{2}^{\prime}\end{subarray}}^{n_{i}}\sum\limits_{\footnotesize\begin{subarray}{c}k_{1}^{\prime},k_{2}^{\prime}=1\\ \ell_{2}^{\prime}\neq\ell_{1}^{\prime}\neq k_{1}^{\prime}>k_{2}^{\prime}\neq\ell_{1}^{\prime}\neq\ell_{2}^{\prime}\end{subarray}}^{n_{i}}\frac{\operatorname{{\it Cov}}\left(\left[{{\boldsymbol{Y}}_{i,\ell_{1},\ell_{2}}}^{\top}\boldsymbol{T}_{S}{\boldsymbol{Y}}_{i,k_{1},k_{2}}\right]^{2}\hskip 1.42271pt\large{\textbf{;}}\hskip 1.42271pt\left[{{\boldsymbol{Y}}_{i,\ell_{1}^{\prime},\ell_{2}^{\prime}}}^{\top}\boldsymbol{T}_{S}{\boldsymbol{Y}}_{i,k_{1}^{\prime},k_{2}^{\prime}}\right]^{2}\right)}{4^{2}\cdot 6^{2}\cdot\binom{n_{i}}{4}^{2}}\\[17.22217pt] &\stackrel{{\scriptstyle\ref{Var1}}}{{\leq}}\frac{6\binom{n_{i}}{4}-6\binom{n_{i}-4}{4}}{4^{2}\cdot 6\cdot\binom{n_{i}}{4}}\operatorname{{\it Var}}\left(\left[{{\boldsymbol{Y}}_{i,1,2}}^{\top}\boldsymbol{T}_{S}{\boldsymbol{Y}}_{i,3,4}\right]^{2}\right)\\[8.61108pt] &\stackrel{{\scriptstyle\ref{QF4}}}{{=}}\frac{\binom{n_{i}}{4}-\binom{n_{i}-4}{4}}{16\binom{n_{i}}{4}}\mathcal{O}\left(\operatorname{tr}^{2}\left(\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)^{2}\right)\right)\\[8.61108pt] &=\mathcal{O}\left(n_{i}^{-1}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{2}\left(\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)^{2}\right)\right),\end{array}
\begin{array}[]{ll}{\mathbb{E}}\left(A_{i,r,2}\right)&=\frac{1}{4\cdot\binom{n_{i}}{2}\binom{n_{r}}{2}}{\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1},\ell_{2}=1\\ \ell_{1}>\ell_{2}\end{subarray}}^{n_{i}}\sum\limits_{\footnotesize\begin{subarray}{c}k_{1},k_{2}=1\\ k_{1}>k_{2}\end{subarray}}^{n_{r}}{\mathbb{E}}\left(\left[{{\boldsymbol{Y}}_{i,\ell_{1},\ell_{2}}}^{\top}\boldsymbol{T}_{S}{\boldsymbol{Y}}_{r,k_{1},k_{2}}\right]^{2}\right)}\\ &\stackrel{{\scriptstyle\ref{QF4}}}{{=}}\frac{1}{4\cdot\binom{n_{i}}{2}\binom{n_{r}}{2}}\cdot\binom{n_{i}}{2}\cdot\binom{n_{r}}{2}\cdot\operatorname{tr}\left(4\cdot\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}\right)=\operatorname{tr}\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}\right),\end{array}
\begin{array}[]{ll}\operatorname{{\it Var}}\left(\frac{2N^{2}}{n_{i}n_{r}}A_{i,r,2}\right)&=\frac{4N^{4}}{n_{i}^{2}n_{r}^{2}}\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1},\ell_{2}=1\\ \ell_{1}>\ell_{2}\end{subarray}}^{n_{i}}\sum\limits_{\footnotesize\begin{subarray}{c}k_{1},k_{2}=1\\ k_{1}>k_{2}\end{subarray}}^{n_{r}}\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1}^{\prime},\ell_{2}^{\prime}=1\\ \ell_{1}^{\prime}>\ell_{2}^{\prime}\end{subarray}}^{n_{i}}\sum\limits_{\footnotesize\begin{subarray}{c}k_{1}^{\prime},k_{2}^{\prime}=1\\ k_{1}^{\prime}>k_{2}^{\prime}\end{subarray}}^{n_{r}}\frac{\operatorname{{\it Cov}}\left(\left[{{\boldsymbol{Y}}_{i,\ell_{1},\ell_{2}}}^{\top}\boldsymbol{T}_{S}{\boldsymbol{Y}}_{r,k_{1},k_{2}}\right]^{2}\hskip 1.42271pt\large{\textbf{;}}\hskip 1.42271pt\left[{{\boldsymbol{Y}}_{i,\ell_{1}^{\prime},\ell_{2}^{\prime}}}^{\top}\boldsymbol{T}_{S}\hskip 0.71114pt{\boldsymbol{Y}}_{r,k_{1}^{\prime},k_{2}^{\prime}}\right]^{2}\right)}{16\cdot\binom{n_{i}}{2}^{2}\binom{n_{r}}{2}^{2}}\\[12.91663pt] &\stackrel{{\scriptstyle\ref{Var1}}}{{\leq}}\frac{4N^{4}}{n_{i}^{2}n_{r}^{2}}\frac{\binom{n_{i}}{2}\binom{n_{r}}{2}-\binom{n_{i}-2}{2}\binom{n_{r}-2}{2}}{16\cdot\binom{n_{i}}{2}\binom{n_{r}}{2}}\operatorname{{\it Var}}\left(\left[{{\boldsymbol{Y}}_{i,1,2}}^{\top}\boldsymbol{T}_{S}\hskip 0.71114pt{\boldsymbol{Y}}_{r,1,2}\right]^{2}\right)\\[8.61108pt] &\stackrel{{\scriptstyle\ref{QF4}}}{{\leq}}\frac{\binom{n_{i}}{2}\binom{n_{r}}{2}-\binom{n_{i}-2}{2}\binom{n_{r}-2}{2}}{\binom{n_{i}}{2}\binom{n_{r}}{2}}\cdot\mathcal{O}\left(\operatorname{tr}^{2}\left(\frac{N}{n_{i}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\frac{N}{n_{r}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}\right)\right)\\[6.45831pt] &\leq\mathcal{O}\left(\frac{1}{n_{\min}}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{2}\left(\frac{N}{n_{i}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\frac{N}{n_{r}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}\right)\right).\end{array}
Finally, the conditions for have to be checked. With the expectations from above we calculate
\begin{array}[]{ll}{\mathbb{E}}\left(A_{4}\right)&=\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}{\mathbb{E}}(A_{i,3})+2\sum\limits_{i=1}^{a}\sum\limits_{r=1,r<i}^{a}\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}{\mathbb{E}}\left(A_{i,r,2}\right)\\[6.45831pt] &=\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}\operatorname{tr}\left(\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)^{2}\right)+2\sum\limits_{i=1}^{a}\sum\limits_{r=1,r<i}^{a}\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}\operatorname{tr}\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}\right)=\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right).\end{array}
To calculate the variances the following additional inequalities are needed:
\begin{array}[]{ll}\frac{\operatorname{{\it Var}}\left(\sum\limits_{i=1}^{a}\left(\frac{N}{n_{i}}\right)^{2}{(\boldsymbol{T}_{W})_{ii}}^{2}A_{i,3}\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}&=\frac{\sum\limits_{i=1}^{a}\operatorname{{\it Var}}\left(\left(\frac{N}{n_{i}}\right)^{2}{(\boldsymbol{T}_{W})_{ii}}^{2}A_{i,3}\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\\ &\leq\sum\limits_{i=1}^{a}\mathcal{O}\left(n_{i}^{-1}\right)\cdot\frac{\mathcal{O}\left({(\boldsymbol{T}_{W})_{ii}}^{4}\operatorname{tr}^{2}\left(\left(\boldsymbol{T}_{S}\frac{N}{n_{i}}\boldsymbol{\Sigma}_{i}\right)^{2}\right)\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\\ &\leq\mathcal{O}\left(\frac{1}{n_{\min}}\right)\frac{\mathcal{O}\left(\operatorname{tr}^{2}\left(\sum\limits_{i=1}^{a}{(\boldsymbol{T}_{W})_{ii}}^{2}\left(\boldsymbol{T}_{S}\frac{N}{n_{i}}\boldsymbol{\Sigma}_{i}\right)^{2}\right)\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}=\mathcal{O}\left(\frac{1}{n_{\min}}\right)\end{array}
and
\begin{array}[]{ll}&\frac{\operatorname{{\it Var}}\left(2\sum\limits_{r<i\in{\mathbb{N}}_{a}}\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}A_{i,r,2}\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\\[6.45831pt] \stackrel{{\scriptstyle\ref{Var1}}}{{\leq}}&4\sum\limits_{r<i\in{\mathbb{N}}_{a}}\sum\limits_{h<g\in{\mathbb{N}}_{a}}\frac{\sqrt{\operatorname{{\it Var}}\left(\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}A_{i,r,2}\right)}\sqrt{\operatorname{{\it Var}}\left(\frac{N^{2}}{n_{h}n_{g}}{(\boldsymbol{T}_{W})_{hg}}^{2}A_{h,g,2}\right)}}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\\[8.61108pt] \end{array}
\begin{array}[]{ll}\leq&\left(\sum\limits_{i\neq r\in{\mathbb{N}}_{a}}\frac{\sqrt{\mathcal{O}\left(\frac{1}{n_{\min}}\right)}{(\boldsymbol{T}_{W})_{ir}}^{2}\operatorname{tr}\left(\boldsymbol{T}_{S}\frac{N}{n_{i}}\boldsymbol{\Sigma}_{i}\boldsymbol{T}_{S}\frac{N}{n_{r}}\boldsymbol{\Sigma}_{r}\right)}{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)^{2}\\[8.61108pt] \leq&\mathcal{O}\left(\frac{1}{n_{\min}}\right)\left(\frac{\mathcal{O}\left(\sum\limits_{i\neq r\in{\mathbb{N}}_{a}}{(\boldsymbol{T}_{W})_{ir}}^{2}\operatorname{tr}\left(\boldsymbol{T}_{S}\frac{N}{n_{i}}\boldsymbol{\Sigma}_{i}\boldsymbol{T}_{S}\frac{N}{n_{r}}\boldsymbol{\Sigma}_{r}\right)\right)}{\sum\limits_{i,r\in{\mathbb{N}}_{a}}{(\boldsymbol{T}_{W})_{ir}}^{2}\operatorname{tr}\left(\boldsymbol{T}_{S}\frac{N}{n_{i}}\boldsymbol{\Sigma}_{i}\frac{N}{n_{r}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}\right)}\right)^{2}\leq\mathcal{O}\left(\frac{1}{n_{\min}}\right).\end{array}
Together this leads to
\begin{array}[]{ll}\operatorname{{\it Var}}\left(\frac{A_{4}}{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)&\stackrel{{\scriptstyle\ref{Var1}}}{{\leq}}\left[\sqrt{\frac{\operatorname{{\it Var}}\left(2\sum\limits_{r<i\in{\mathbb{N}}_{a}}\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}A_{i,r,2}\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}}+\sqrt{\frac{\operatorname{{\it Var}}\left(\sum\limits_{i=1}^{a}\left(\frac{N}{n_{i}}\right)^{2}{(\boldsymbol{T}_{W})_{ii}}^{2}A_{i,3}\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}}\right]^{2}\\ &\leq\left[\sqrt{\mathcal{O}\left(\frac{1}{n_{\min}}\right)}+\sqrt{\mathcal{O}\left(\frac{1}{n_{\min}}\right)}\right]^{2}=\mathcal{O}\left(\frac{1}{n_{\min}}\right)\end{array}
and therefore $A_{4}$ is an unbiased and ratio-consistent estimator of $\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)$.
Moreover, we want to stress that the zero sequences used as upper bounds for and do not depend on the number of groups or dimensions, so these estimators can also be used for an increasing number of groups.
With the expectations and variances from the beginning it follows directly that are unbiased, ratio-consistent estimators of and .
It is worth noting that all of these estimators are also consistent estimators which are even dimension-stable in the sense of [8]. ∎
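For illustration, the pairwise-difference estimators recalled at the beginning of this proof can be written down directly from their definitions. The following is a naive sketch, not an efficient implementation: the function names `A1`, `A2` and `A3` are hypothetical, each data matrix is assumed to hold the observations of one group as rows, and the centering matrix used for $\boldsymbol{T}_{S}$ is only an example.

```python
import numpy as np
from itertools import combinations
from math import comb

def A1(X, T_S):
    """A_{i,1}: average of Y^T T_S Y / 2 over all pairs l1 > l2, Y = X_l1 - X_l2."""
    n = X.shape[0]
    total = 0.0
    for l2, l1 in combinations(range(n), 2):
        y = X[l1] - X[l2]
        total += y @ T_S @ y
    return total / (2 * comb(n, 2))

def A2(X_i, X_r, T_S):
    """A_{i,r,2}: squared bilinear forms of pair differences from two groups."""
    n_i, n_r = X_i.shape[0], X_r.shape[0]
    total = 0.0
    for l2, l1 in combinations(range(n_i), 2):
        y = X_i[l1] - X_i[l2]
        for k2, k1 in combinations(range(n_r), 2):
            total += (y @ T_S @ (X_r[k1] - X_r[k2])) ** 2
    return total / (4 * comb(n_i, 2) * comb(n_r, 2))

def A3(X, T_S):
    """A_{i,3}: as A2 within one group, using only four distinct indices (n >= 4)."""
    n = X.shape[0]
    total = 0.0
    for l2, l1 in combinations(range(n), 2):
        y = X[l1] - X[l2]
        for k2, k1 in combinations(range(n), 2):
            if len({l1, l2, k1, k2}) == 4:
                total += (y @ T_S @ (X[k1] - X[k2])) ** 2
    return total / (4 * 6 * comb(n, 4))

# Illustrative data: n = 6 observations of dimension d = 5.
rng = np.random.default_rng(2)
X = rng.standard_normal((6, 5))
T_S = np.eye(5) - np.ones((5, 5)) / 5

# Sanity check: A1 equals tr(T_S * S) with S the unbiased sample covariance.
print(np.isclose(A1(X, T_S), np.trace(T_S @ np.cov(X, rowvar=False))))
```

The loops cost $\mathcal{O}(n^{4})$ bilinear forms for `A2` and `A3`, which is why more efficient formulations and subsampling versions are attractive in practice.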
For there exists an alternative form which can be implemented substantially more efficiently and was considered in [9]. It is based on matrices of the form . Recalling that is the vector of ones and denotes the Hadamard-Schur product, it can be seen that
[TABLE]
For there also exists an alternative formula, which is considerably longer but more efficient:
[TABLE]
To finally prove Theorem 3.2 we need another lemma.
Lemma A.11:
For the previously defined estimators it holds for that
[TABLE]
- Proof:
We know that
\begin{array}[]{ll}&{\mathbb{E}}\left(\frac{\sum_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}A_{i,1}-\sum_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}\operatorname{tr}\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)}{\sqrt{2\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}}\right)=\frac{\sum_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}\left({\mathbb{E}}\left(A_{i,1}\right)-\operatorname{tr}\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)\right)}{\sqrt{2\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}}=0.\end{array}
Thus,
\begin{array}[]{ll}\operatorname{{\it Var}}\left(\frac{\sum_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}\left(A_{i,1}-\operatorname{tr}\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)\right)}{\sqrt{2\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}}\right)&=\frac{\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}\operatorname{{\it Var}}\left(A_{i,1}\right)}{2\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\\[12.91663pt] &\stackrel{{\scriptstyle\text{Proof of \ref{Schae1}}}}{{\leq}}\mathcal{O}\left(\frac{1}{n_{\min}}\right)\frac{\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}\operatorname{tr}\left(\left(2\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)^{2}\right)}{2\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}=\mathcal{O}\left(\frac{1}{n_{\min}}\right).\end{array}
In the last inequality we used the fact that all terms are non-negative and applied the binomial theorem. The bound is a zero sequence which only depends on , so the result follows again from Lemma A.6. ∎
- Proof of Theorem 3.2:
From Lemma A.6 it follows for and independent of or that and therefore . Moreover, it also follows that and with Lemma A.11 we deduce .
Thus, we can finally calculate the standardized quadratic form as
\begin{array}[]{ll}W_{N}&=\frac{Q_{N}-\sum_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}A_{i,1}}{\sqrt{2A_{4}}}\\[4.30554pt] &=\left(\frac{Q_{N}-\operatorname{tr}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)}{\sqrt{2\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}}-\frac{\sum_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}A_{i,1}-\operatorname{tr}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)}{\sqrt{2\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}}\right)\cdot\sqrt{\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}{A_{4}}}\\[10.76385pt] &=\left(\frac{Q_{N}-\operatorname{tr}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)}{\sqrt{2\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}}-o_{p}(1)\right)\cdot(1+o_{p}(1))\\[10.76385pt] &=\widetilde{W}_{N}+\widetilde{W}_{N}\cdot o_{p}(1)-o_{p}(1)-o_{p}(1)\cdot o_{p}(1).\end{array}
The last two terms converge to zero in probability and hence also in distribution, and by Slutsky's lemma $\widetilde{W}_{N}\cdot o_{p}(1)$ converges to zero in distribution as well if one of the conditions of Theorem 3.1 is fulfilled. Thereby $W_{N}$ has asymptotically the same distribution as $\widetilde{W}_{N}$. ∎
For large numbers of groups, many estimators $A_{i,1}$, $A_{i,r,2}$ and $A_{i,3}$ have to be calculated, which leads to long computation times. In these cases it is better to again use subsampling-type estimators, which leads to and therefore to .
Lemma A.12:
*With the definitions from above, let
\begin{array}[]{l}A_{i,1}^{\star}(B)\hskip 5.69046pt=\frac{1}{2\cdot B}\sum\limits_{b=1}^{B}{{\boldsymbol{Y}}_{i,\sigma_{i1}(b),\sigma_{i2}(b)}}^{\top}\boldsymbol{T}_{S}{{\boldsymbol{Y}}_{i,\sigma_{i1}(b),\sigma_{i2}(b)}},\end{array}
\begin{array}[]{l}A_{i,r,2}^{\star}(B)\hskip 0.28436pt=\frac{1}{4\cdot B}{\sum\limits_{b=1}^{B}\left[{{\boldsymbol{Y}}_{i,\sigma_{i1}(b),\sigma_{i2}(b)}}^{\top}\boldsymbol{T}_{S}{{\boldsymbol{Y}}_{r,\sigma_{r1}(b),\sigma_{r2}(b)}}\right]^{2}},\end{array}
\begin{array}[]{l}A_{i,3}^{\star}(B)\hskip 5.69046pt=\frac{1}{4\cdot B}{\sum\limits_{b=1}^{B}\left[{{\boldsymbol{Y}}_{i,\sigma_{i1}(b),\sigma_{i2}(b)}}^{\top}\boldsymbol{T}_{S}{{\boldsymbol{Y}}_{i,\sigma_{i3}(b),\sigma_{i4}(b)}}\right]^{2}},\end{array}
\begin{array}[]{l}A_{4}^{\star}(B)\hskip 11.38092pt=\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}\cdot A_{i,3}^{\star}(B)+2\sum_{i=1}^{a}\sum_{r=1,r<i}^{a}\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}A_{i,r,2}^{\star}(B).\end{array}
If , these estimators and have the same properties as and , which were defined in 3.1.
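To make the subsampling idea concrete, the following minimal Python sketch computes $A_{i,1}^{\star}(B)$ for a single group. It rests on assumptions that are not restated in this section: that $\boldsymbol{Y}_{i,\ell_{1},\ell_{2}}=\boldsymbol{X}_{i,\ell_{1}}-\boldsymbol{X}_{i,\ell_{2}}$ is the difference of two observation vectors of group $i$, and that each pair $(\sigma_{i1}(b),\sigma_{i2}(b))$ is drawn uniformly among distinct indices. The function names and the list-based data layout are hypothetical.

```python
import random


def quad_form(y, t_s):
    """Return y^T T_S y for a vector y and a square matrix t_s (nested lists)."""
    d = len(y)
    return sum(y[j] * t_s[j][k] * y[k] for j in range(d) for k in range(d))


def subsample_a1(x_i, t_s, num_draws, rng):
    """Sketch of A*_{i,1}(B): mean of Y^T T_S Y / 2 over B random index pairs.

    x_i       -- observations of group i, one list of length d per subject
    t_s       -- d x d matrix T_S
    num_draws -- number of subsamples B
    rng       -- random.Random instance (assumed source of the random indices)
    """
    n_i = len(x_i)
    total = 0.0
    for _ in range(num_draws):
        # Draw two distinct subject indices uniformly at random.
        l1, l2 = rng.sample(range(n_i), 2)
        # Assumed definition: Y is the difference of the two observations.
        y = [u - v for u, v in zip(x_i[l1], x_i[l2])]
        total += quad_form(y, t_s)
    return total / (2 * num_draws)


if __name__ == "__main__":
    observations = [[1.0, 0.0], [0.0, 1.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(subsample_a1(observations, identity, 5, random.Random(0)))
```

With only two subjects every draw uses the same pair up to sign, so the result is deterministic here; with more subjects the value varies with the random indices, which is exactly the extra variance the lemma controls.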
- Proof:
*For $A_{i,1}^{\star}(B)$, this lemma will be proved in detail. For all other terms only the major steps are shown.
*The unbiasedness is clear because the random variables have no influence on the number of terms of the sum and also the terms are identically distributed. Hence,
\begin{array}[]{ll}{\mathbb{E}}\left(A_{i,1}^{\star}(B)\right)&=\frac{1}{2\cdot B}\sum\limits_{b=1}^{B}{\mathbb{E}}\left({{\boldsymbol{Y}}_{i,\sigma_{i1}(b),\sigma_{i2}(b)}}^{\top}\boldsymbol{T}_{S}{{\boldsymbol{Y}}_{i,\sigma_{i1}(b),\sigma_{i2}(b)}}\right)\\[8.61108pt] &=\frac{1}{2\cdot B}\sum\limits_{b=1}^{B}{\mathbb{E}}\left({{\boldsymbol{Y}}_{i,1,2}}^{\top}\boldsymbol{T}_{S}{\boldsymbol{Y}}_{i,1,2}\right)\stackrel{{\scriptstyle\ref{QF4}}}{{=}}\operatorname{tr}(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}).\end{array}
*The second part is more complicated. Let $\mathcal{F}(\boldsymbol{\sigma}_{i}(B,2))$ be the smallest $\sigma$-field which contains , so obviously is $\mathcal{F}(\boldsymbol{\sigma}_{i}(B,2))$-measurable. The same holds for and . Similar to the previous part, the distribution of the bilinear form does not depend on the index combination. Together with the independence of the normally distributed vectors and this leads to
\begin{array}[]{l}\operatorname{{\it Var}}\left({\mathbb{E}}\left(A_{i,1}^{\star}(B)\big{|}\mathcal{F}(\boldsymbol{\sigma}_{i}(B,2))\right)\right)=\operatorname{{\it Var}}\left(\operatorname{tr}\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)\right)=0.\end{array}
With A.9 we thus obtain
\begin{array}[]{ll}\operatorname{{\it Var}}\left(A_{i,1}^{\star}(B)\right)&=0+{\mathbb{E}}\left(\operatorname{{\it Var}}\left(A_{i,1}^{\star}(B)|\mathcal{F}(\boldsymbol{\sigma}_{i}(B,2))\right)\right).\end{array}
*For the conditional variance of the sum it is useful to find an upper bound that is based on the variances alone, instead of calculating all covariances. To achieve this, we count the index combinations which lead to a covariance that is zero. This set is random, and we note that it contains the set which was considered before.
Again, it is not the set itself that matters but the number of elements it contains, since the bilinear forms are identically distributed. Therefore the conditioning in the variance of the bilinear form can be dropped, since the random indices have no influence on the variance. With the $\mathcal{F}(\boldsymbol{\sigma}_{i}(B,2))$-measurability of $M(B,(\boldsymbol{\sigma}_{i}(b,2)))$ it thus follows that
\begin{array}[]{ll}\operatorname{{\it Var}}\left(A_{i,1}^{\star}(B)\right)&=0+{\mathbb{E}}\left(\operatorname{{\it Var}}\left(A_{i,1}^{\star}(B)|\mathcal{F}(\boldsymbol{\sigma}_{i}(B,2))\right)\right)\\[4.30554pt] &\stackrel{{\scriptstyle\ref{Var1}}}{{\leq}}\frac{1}{4B^{2}}{\mathbb{E}}\left(\sum\limits_{(j,\ell)\in{\mathbb{N}}_{B}\times{\mathbb{N}}_{B}\setminus M(B,(\boldsymbol{\sigma}_{i}(b,2)))}\operatorname{{\it Var}}\left({{\boldsymbol{Y}}_{i,\sigma_{i1}(j),\sigma_{i2}(j)}}^{\top}\boldsymbol{T}_{S}{{\boldsymbol{Y}}_{i,\sigma_{i1}(j),\sigma_{i2}(j)}}\big{|}\mathcal{F}(\boldsymbol{\sigma}_{i}(B,2))\right)\right)\end{array}
\begin{array}[]{ll}\phantom{\operatorname{{\it Var}}\left(A_{i,1}^{\star}(B)\right)}&=\frac{1}{4B^{2}}{\mathbb{E}}\left(\sum\limits_{(j,\ell)\in{\mathbb{N}}_{B}\times{\mathbb{N}}_{B}\setminus M(B,(\boldsymbol{\sigma}_{i}(b,2)))}\operatorname{{\it Var}}\left({{\boldsymbol{Y}}_{i,1,2}}^{\top}\boldsymbol{T}_{S}{{\boldsymbol{Y}}_{i,1,2}}\right)\right)\\[10.76385pt] &\stackrel{{\scriptstyle\ref{QF4}}}{{=}}\frac{{\mathbb{E}}\left(|{\mathbb{N}}_{B}\times{\mathbb{N}}_{B}\setminus M(B,(\boldsymbol{\sigma}_{i}(b,2)))|\right)}{B^{2}}\cdot\frac{\mathcal{O}\left(\operatorname{tr}^{2}\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)\right)}{4}\\[2.15277pt] &\stackrel{{\scriptstyle\ref{Menge}}}{{=}}\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{i}-2}{2}}{\binom{n_{i}}{2}}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{2}\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)\right).\end{array}
*The other values are calculated in a similar way.
\begin{array}[]{ll}{\mathbb{E}}\left(A_{i,r,2}^{\star}(B)\right)&=\frac{1}{4\cdot B}\sum\limits_{b=1}^{B}{\mathbb{E}}\left(\left[{{\boldsymbol{Y}}_{i,\sigma_{i1}(b),\sigma_{i2}(b)}}^{\top}\boldsymbol{T}_{S}{{\boldsymbol{Y}}_{r,\sigma_{r1}(b),\sigma_{r2}(b)}}\right]^{2}\right)\\[8.61108pt] &=\frac{1}{4\cdot B}\sum\limits_{b=1}^{B}{\mathbb{E}}\left(\left[{{\boldsymbol{Y}}_{i,1,2}}^{\top}\boldsymbol{T}_{S}{\boldsymbol{Y}}_{r,1,2}\right]^{2}\right)\stackrel{{\scriptstyle\ref{QF4}}}{{=}}\operatorname{tr}(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}).\end{array}
\begin{array}[]{l}\operatorname{{\it Var}}\left({\mathbb{E}}\left(A_{i,r,2}^{\star}(B)|\mathcal{F}(\boldsymbol{\sigma}_{i}(B,2),\boldsymbol{\sigma}_{r}(B,2))\right)\right)=\operatorname{{\it Var}}\left(\operatorname{tr}\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}\right)\right)=0.\end{array}
\begin{array}[]{ll}\operatorname{{\it Var}}\left(A_{i,r,2}^{\star}(B)\right)&=0+{{\mathbb{E}}\left(\operatorname{{\it Var}}\left(A_{i,r,2}^{\star}(B)|\mathcal{F}(\boldsymbol{\sigma}_{i}(B,2),\boldsymbol{\sigma}_{r}(B,2))\right)\right)}\\[4.30554pt] &\leq\frac{{\mathbb{E}}\left(|{\mathbb{N}}_{B}\times{\mathbb{N}}_{B}\setminus M(B,\boldsymbol{\sigma}_{i}(b,2),\boldsymbol{\sigma}_{r}(b,2))|\right)}{B^{2}}\cdot\operatorname{{\it Var}}\left(\left[{{\boldsymbol{Y}}_{i,1,2}}^{\top}\boldsymbol{T}_{S}{{\boldsymbol{Y}}_{r,1,2}}\right]^{2}\right)\end{array}
\begin{array}[]{ll}\phantom{\operatorname{{\it Var}}\left(A_{i,r,2}^{\star}(B)\right)}&\stackrel{{\scriptstyle\ref{QF4}}}{{\leq}}\frac{{\mathbb{E}}\left(|{\mathbb{N}}_{B}\times{\mathbb{N}}_{B}\setminus M(B,\boldsymbol{\sigma}_{i}(b,2),\boldsymbol{\sigma}_{r}(b,2))|\right)}{B^{2}}\cdot\mathcal{O}\left({\operatorname{tr}^{2}\left(\frac{N}{n_{i}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\frac{N}{n_{r}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}\right)}\right)\\[6.45831pt] &\stackrel{{\scriptstyle\ref{Menge}}}{{=}}\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{i}-2}{2}}{\binom{n_{i}}{2}}\cdot\frac{\binom{n_{r}-2}{2}}{\binom{n_{r}}{2}}\right)\cdot\mathcal{O}\left({\operatorname{tr}^{2}\left(\frac{N}{n_{i}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\frac{N}{n_{r}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}\right)}\right)\\[8.61108pt] &\leq\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-2}{2}^{2}}{\binom{n_{\min}}{2}^{2}}\right)\cdot\mathcal{O}\left({\operatorname{tr}^{2}\left(\frac{N}{n_{i}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\frac{N}{n_{r}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}\right)}\right).\end{array}
\begin{array}[]{ll}{\mathbb{E}}\left(A_{i,3}^{\star}(B)\right)&=\frac{1}{4\cdot B}\sum\limits_{b=1}^{B}{\mathbb{E}}\left(\left[{{\boldsymbol{Y}}_{i,\sigma_{i1}(b),\sigma_{i2}(b)}}^{\top}\boldsymbol{T}_{S}{{\boldsymbol{Y}}_{i,\sigma_{i3}(b),\sigma_{i4}(b)}}\right]^{2}\right)\\[8.61108pt] &=\frac{1}{4\cdot B}\sum\limits_{b=1}^{B}{\mathbb{E}}\left(\left[{{\boldsymbol{Y}}_{i,1,2}}^{\top}\boldsymbol{T}_{S}{\boldsymbol{Y}}_{i,3,4}\right]^{2}\right)\stackrel{{\scriptstyle\ref{QF4}}}{{=}}\operatorname{tr}\left(\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)^{2}\right).\end{array}
\begin{array}[t]{l}\operatorname{{\it Var}}\left({\mathbb{E}}\left(A_{i,3}^{\star}(B)|\mathcal{F}(\boldsymbol{\sigma}_{i}(B,4))\right)\right)=\operatorname{{\it Var}}\left(\operatorname{tr}\left(\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)^{2}\right)\right)=0.\end{array}
\begin{array}[t]{ll}\operatorname{{\it Var}}\left(A_{i,3}^{\star}(B)\right)&=0+{\mathbb{E}}\left(\operatorname{{\it Var}}\left(A_{i,3}^{\star}(B)|\mathcal{F}(\boldsymbol{\sigma}_{i}(B,4))\right)\right)\\[2.15277pt] &\stackrel{{\scriptstyle\ref{Var1}}}{{\leq}}\frac{1}{16B^{2}}{\mathbb{E}}\left(\sum\limits_{(j,\ell)\in{\mathbb{N}}_{B}\times{\mathbb{N}}_{B}\setminus M(B,\boldsymbol{\sigma}_{i}(b,4))}\operatorname{{\it Var}}\left(\left[{{\boldsymbol{Y}}_{i,\sigma_{i1}(j),\sigma_{i2}(j)}}^{\top}\boldsymbol{T}_{S}{{\boldsymbol{Y}}_{i,\sigma_{i3}(j),\sigma_{i4}(j)}}\right]^{2}\Big{|}\mathcal{F}(\boldsymbol{\sigma}_{i}(B,4))\right)\right)\\[8.61108pt] &\stackrel{{\scriptstyle\ref{QF4}}}{{\leq}}\frac{{\mathbb{E}}\left(|{\mathbb{N}}_{B}\times{\mathbb{N}}_{B}\setminus M(B,\boldsymbol{\sigma}_{i}(b,4))|\right)}{B^{2}}\cdot\frac{\mathcal{O}\left(\operatorname{tr}^{2}\left(\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)^{2}\right)\right)}{16}\\[4.30554pt] &\stackrel{{\scriptstyle\ref{Menge}}}{{=}}\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{i}-4}{4}}{\binom{n_{i}}{4}}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{2}\left(\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)^{2}\right)\right).\end{array}
\begin{array}[]{l}{\mathbb{E}}\left(\sum\limits_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}A_{i,1}^{\star}\right)=\sum\limits_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}{\mathbb{E}}\left(A_{i,1}^{\star}\right)=\sum\limits_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}\operatorname{tr}\left(\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right).\end{array}
\begin{array}[]{ll}\operatorname{{\it Var}}\left(\frac{\sum\limits_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}A_{i,1}^{\star}}{\operatorname{tr}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)}\right)&=\frac{\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}\operatorname{{\it Var}}\left(A_{i,1}^{\star}\right)}{\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)}\\[-4.73611pt] &=\frac{\sum\limits_{i=1}^{a}{(\boldsymbol{T}_{W})_{ii}}^{2}\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{i}-2}{2}}{\binom{n_{i}}{2}}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{2}\left(\boldsymbol{T}_{S}\frac{N}{n_{i}}\boldsymbol{\Sigma}_{i}\right)\right)}{\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)}\\[5.59721pt] &\leq\frac{\sum\limits_{i=1}^{a}{(\boldsymbol{T}_{W})_{ii}}^{2}\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-2}{2}}{\binom{n_{\min}}{2}}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{2}\left(\boldsymbol{T}_{S}\frac{N}{n_{i}}\boldsymbol{\Sigma}_{i}\right)\right)}{\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)}\\[4.30554pt] &\leq\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-2}{2}}{\binom{n_{\min}}{2}}\right)\cdot\frac{\mathcal{O}\left(\operatorname{tr}^{2}\left(\sum\limits_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)\right)}{\operatorname{tr}^{2}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)}\\[4.30554pt] &=\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-2}{2}}{\binom{n_{\min}}{2}}\right)\cdot\mathcal{O}\left(1\right).\end{array}
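*For $n_{\min},B\to\infty$ the first factor is a zero sequence, and therefore $\sum_{i=1}^{a}\frac{N}{n_{i}}(\boldsymbol{T}_{W})_{ii}A_{i,1}^{\star}$ is a ratio-consistent, unbiased estimator of $\operatorname{tr}\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)$.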
\begin{array}[]{ll}&{\mathbb{E}}\left(\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}A_{i,3}^{\star}+\sum\limits_{i\neq r\in{\mathbb{N}}_{a}}\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}A_{i,r,2}^{\star}\right)\\[10.76385pt] =&\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}{\mathbb{E}}\left(A_{i,3}^{\star}\right)+\sum\limits_{i\neq r\in{\mathbb{N}}_{a}}\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}{\mathbb{E}}\left(A_{i,r,2}^{\star}\right)=\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right).\end{array}
\begin{array}[]{ll}\operatorname{{\it Var}}\left(\frac{\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}A_{i,3}^{\star}}{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)&=\frac{\sum\limits_{i=1}^{a}\operatorname{{\it Var}}\left(\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}A_{i,3}^{\star}\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\\[-4.30554pt] &\leq\frac{\sum\limits_{i=1}^{a}{(\boldsymbol{T}_{W})_{ii}}^{4}\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{i}-4}{4}}{\binom{n_{i}}{4}}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{2}\left(\left(\boldsymbol{T}_{S}\frac{N}{n_{i}}\boldsymbol{\Sigma}_{i}\right)^{2}\right)\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\end{array}
\begin{array}[]{ll}\phantom{\operatorname{{\it Var}}\left(\frac{\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}A_{i,3}^{\star}}{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)}&\leq\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-4}{4}}{\binom{n_{\min}}{4}}\right)\cdot\frac{\sum\limits_{i=1}^{a}{(\boldsymbol{T}_{W})_{ii}}^{4}\mathcal{O}\left(\operatorname{tr}^{2}\left(\left(\boldsymbol{T}_{S}\frac{N}{n_{i}}\boldsymbol{\Sigma}_{i}\right)^{2}\right)\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\\[-3.44444pt] &\leq\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-4}{4}}{\binom{n_{\min}}{4}}\right)\cdot\frac{\mathcal{O}\left(\operatorname{tr}^{2}\left(\left(\sum\limits_{i=1}^{a}\frac{N}{n_{i}}{(\boldsymbol{T}_{W})_{ii}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\right)^{2}\right)\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\\[8.61108pt] &\leq\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-4}{4}}{\binom{n_{\min}}{4}}\right)\cdot\mathcal{O}\left(1\right).\end{array}
\begin{array}[]{ll}&\operatorname{{\it Var}}\left(\frac{\sum\limits_{i\neq r\in{\mathbb{N}}_{a}}\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}A_{i,r,2}^{\star}}{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)\\[10.76385pt] \leq&\left(\sum\limits_{i\neq r\in{\mathbb{N}}_{a}}\frac{\sqrt{\operatorname{{\it Var}}\left(\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}A_{i,r,2}^{\star}\right)}}{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)^{2}\end{array}
\begin{array}[]{ll}\leq&\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-2}{2}^{2}}{\binom{n_{\min}}{2}^{2}}\right)\cdot\left(\frac{\sum\limits_{i\neq r\in{\mathbb{N}}_{a}}{(\boldsymbol{T}_{W})_{ir}}^{2}\sqrt{\mathcal{O}\left({\operatorname{tr}^{2}\left(\frac{N}{n_{i}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{i}\frac{N}{n_{r}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}\right)}\right)}}{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)^{2}\\[10.76385pt] \leq&\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-2}{2}^{2}}{\binom{n_{\min}}{2}^{2}}\right)\cdot\left(\frac{\sum\limits_{i\neq r\in{\mathbb{N}}_{a}}\mathcal{O}\left({(\boldsymbol{T}_{W})_{ir}}^{2}\operatorname{tr}\left(\boldsymbol{T}_{S}\frac{N}{n_{i}}\boldsymbol{\Sigma}_{i}\boldsymbol{T}_{S}\frac{N}{n_{r}}\boldsymbol{\Sigma}_{r}\right)\right)}{\sum\limits_{i,r\in{\mathbb{N}}_{a}}{(\boldsymbol{T}_{W})_{ir}}^{2}\operatorname{tr}\left(\boldsymbol{T}_{S}\frac{N}{n_{i}}\boldsymbol{\Sigma}_{i}\frac{N}{n_{r}}\boldsymbol{T}_{S}\boldsymbol{\Sigma}_{r}\right)}\right)^{2}\\[8.61108pt] \leq&\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-2}{2}^{2}}{\binom{n_{\min}}{2}^{2}}\right)\cdot\mathcal{O}(1).\end{array}
\begin{array}[]{ll}&\operatorname{{\it Var}}\left(\frac{\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}A_{i,3}^{\star}+\sum\limits_{i\neq r\in{\mathbb{N}}_{a}}\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}A_{i,r,2}^{\star}}{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)\\[12.91663pt] \stackrel{{\scriptstyle\ref{Var1}}}{{\leq}}&\left[\sqrt{\frac{\operatorname{{\it Var}}\left(2\sum\limits_{r<i\in{\mathbb{N}}_{a}}\frac{N^{2}}{n_{i}n_{r}}{(\boldsymbol{T}_{W})_{ir}}^{2}A_{i,r,2}^{\star}\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}}+\sqrt{\frac{\operatorname{{\it Var}}\left(\sum\limits_{i=1}^{a}\frac{N^{2}}{n_{i}^{2}}{(\boldsymbol{T}_{W})_{ii}}^{2}A_{i,3}^{\star}\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}}\right]^{2}\end{array}
\begin{array}[]{ll}\leq&\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-2}{2}^{2}}{\binom{n_{\min}}{2}^{2}}\right)\cdot\mathcal{O}(1).\end{array}
So again this is a zero sequence, and $A_{4}^{\star}(B)$ is an unbiased and dimension-stable (i.e. also ratio-consistent) estimator of $\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)$. ∎
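The factor $1-\left(1-\frac{1}{B}\right)\binom{n_{\min}-k}{k}/\binom{n_{\min}}{k}$ that appears in the variance bounds of this proof (with $k=2$ or $k=4$) can be evaluated exactly. The following Python sketch, with a hypothetical function name, illustrates how it shrinks only when both $B$ and $n_{\min}$ grow.

```python
from fractions import Fraction
from math import comb


def subsampling_factor(n_min, num_draws, k=2):
    """Exact value of 1 - (1 - 1/B) * C(n_min - k, k) / C(n_min, k).

    n_min     -- smallest group size (must be at least k)
    num_draws -- number of subsamples B
    k         -- number of indices per draw (2 or 4 in the bounds above)
    """
    ratio = Fraction(comb(n_min - k, k), comb(n_min, k))
    return 1 - (1 - Fraction(1, num_draws)) * ratio


if __name__ == "__main__":
    for n_min, num_draws in [(10, 10), (100, 100), (1000, 1000), (1000, 1)]:
        value = subsampling_factor(n_min, num_draws)
        print(n_min, num_draws, float(value))
```

The last configuration, $B=1$, gives a factor of exactly 1 regardless of $n_{\min}$, so large group sizes alone do not make the bound vanish.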
A.3 Proofs of Section 4
Lemma A.13:
For
[TABLE]
[TABLE]
[TABLE]
we define
[TABLE]
*With this notation it follows that
*
- Proof:
Set
[TABLE]
*It then follows that
\begin{array}[]{ll}&{\mathbb{E}}\left(\boldsymbol{T}\boldsymbol{Z}_{(\ell_{3,1},\ell_{4,1},\dots,\ell_{3,a},\ell_{4,a})}\cdot{\boldsymbol{Z}_{(\ell_{3,1},\ell_{4,1},\dots,\ell_{3,a},\ell_{4,a})}}^{\top}\boldsymbol{T}^{\top}\right)\\[3.44444pt] =&{\mathbb{E}}\left(\left(\sqrt{2}\boldsymbol{T}\boldsymbol{V}_{N}^{1/2}\widetilde{\boldsymbol{Z}}_{(\ell_{3,1},\ell_{4,1},\dots,\ell_{3,a},\ell_{4,a})}\right)\left(\sqrt{2}\boldsymbol{T}\boldsymbol{V}_{N}^{1/2}{\widetilde{\boldsymbol{Z}}_{(\ell_{3,1},\ell_{4,1},\dots,\ell_{3,a},\ell_{4,a})}}\right)^{\top}\right)\\[6.45831pt] =&2\boldsymbol{T}\boldsymbol{V}_{N}^{1/2}{\mathbb{E}}\left(\widetilde{\boldsymbol{Z}}_{(\ell_{3,1},\ell_{4,1},\dots,\ell_{3,a},\ell_{4,a})}{{\widetilde{\boldsymbol{Z}}}_{(\ell_{3,1},\ell_{4,1},\dots,\ell_{3,a},\ell_{4,a})}}^{\top}\right){\boldsymbol{V}_{N}^{1/2}}^{\top}\boldsymbol{T}\\[4.30554pt] =&2\boldsymbol{T}\boldsymbol{V}_{N}^{1/2}\boldsymbol{I}_{ad}{\boldsymbol{V}_{N}^{1/2}}^{\top}\boldsymbol{T}=2\boldsymbol{T}\boldsymbol{V}_{N}\boldsymbol{T}.\end{array}
*With the rules for conditional expectation and the involved independence it follows that
\begin{array}[]{ll}{\mathbb{E}}\left(C_{5}\right)&=\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1,1},\dots,\ell_{6,1}=1\\ \ell_{1,1}\neq\dots\neq\ell_{6,1}\end{subarray}}^{n_{1}}\dots\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1,a},\dots,\ell_{6,a}=1\\ \ell_{1,a}\neq\dots\neq\ell_{6,a}\end{subarray}}^{n_{a}}\frac{{\mathbb{E}}\left(\Lambda_{1}(\ell_{1,1},\dots,\ell_{6,a})\cdot\Lambda_{2}(\ell_{1,1},\dots,\ell_{6,a})\cdot\Lambda_{3}(\ell_{1,1},\dots,\ell_{6,a})\right)}{8\cdot\prod\limits_{i=1}^{a}\frac{n_{i}!}{\left(n_{i}-6\right)!}}\\[12.91663pt] &=\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1,1},\dots,\ell_{6,1}=1\\ \ell_{1,1}\neq\dots\neq\ell_{6,1}\end{subarray}}^{n_{1}}\dots\sum\limits_{\footnotesize\begin{subarray}{c}\ell_{1,a},\dots,\ell_{6,a}=1\\ \ell_{1,a}\neq\dots\neq\ell_{6,a}\end{subarray}}^{n_{a}}\frac{{\mathbb{E}}\left({\boldsymbol{Z}_{(1,2)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(3,4)}\cdot{\boldsymbol{Z}_{(3,4)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(5,6)}\cdot{\boldsymbol{Z}_{(5,6)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(1,2)}\right)}{8\cdot\prod\limits_{i=1}^{a}\frac{n_{i}!}{\left(n_{i}-6\right)!}}\\[17.22217pt] \vspace{0.15cm}&=\frac{1}{8}{\mathbb{E}}\left({\boldsymbol{Z}_{(1,2)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(3,4)}\cdot{\boldsymbol{Z}_{(3,4)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(5,6)}\cdot{\boldsymbol{Z}_{(5,6)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(1,2)}\right)\end{array}
\begin{array}[]{ll}\phantom{{\mathbb{E}}\left(C_{5}\right)}&=\frac{1}{8}{\mathbb{E}}\left({\mathbb{E}}\left({\boldsymbol{Z}_{(1,2)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(3,4)}\cdot{\boldsymbol{Z}_{(3,4)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(5,6)}\cdot{\boldsymbol{Z}_{(5,6)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(1,2)}\bigm{|}\boldsymbol{Z}_{(1,2)}\right)\right)\\[4.30554pt] &=\frac{1}{8}{\mathbb{E}}\left({\boldsymbol{Z}_{(1,2)}}^{\top}{\mathbb{E}}\left(\boldsymbol{T}\boldsymbol{Z}_{(3,4)}\cdot{\boldsymbol{Z}_{(3,4)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(5,6)}\cdot{\boldsymbol{Z}_{(5,6)}}^{\top}\boldsymbol{T}\right)\boldsymbol{Z}_{(1,2)}\right)\\[3.44444pt] &=\frac{4}{8}{\mathbb{E}}\left({\boldsymbol{Z}_{(1,2)}}^{\top}\boldsymbol{T}\boldsymbol{V}_{N}\boldsymbol{T}\boldsymbol{T}\boldsymbol{V}_{N}\boldsymbol{T}\boldsymbol{Z}_{(1,2)}\right)=\frac{1}{2}\operatorname{tr}((\boldsymbol{T}\boldsymbol{V}_{N}\boldsymbol{T}\boldsymbol{T}\boldsymbol{V}_{N}\boldsymbol{T})2\boldsymbol{V}_{N})=\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right).\end{array}
*Since all are identically distributed, we can neglect the concrete indices, as long as we maintain the dependence structure of the bilinear forms. The last term fulfills the requirements from A.5 with and the matrix .
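*For the calculation of the variance it is useful to diagonalize the matrix ${\boldsymbol{V}_{N}^{1/2}}^{\top}\boldsymbol{T}\boldsymbol{V}_{N}^{1/2}$: there exists an orthogonal matrix $\boldsymbol{P}$ with ${\boldsymbol{V}_{N}^{1/2}}^{\top}\boldsymbol{T}\boldsymbol{V}_{N}^{1/2}=\boldsymbol{P}^{\top}\boldsymbol{D}\boldsymbol{P}$ and $\boldsymbol{D}=\operatorname{diag}(\lambda_{1},\dots,\lambda_{ad})$, where $\lambda_{1},\dots,\lambda_{ad}$ are the eigenvalues of this matrix. We define $\boldsymbol{E}_{1}:=\boldsymbol{P}\widetilde{\boldsymbol{Z}}_{(1,2)}$, $\boldsymbol{E}_{3}:=\boldsymbol{P}\widetilde{\boldsymbol{Z}}_{(3,4)}$ and $\boldsymbol{E}_{5}:=\boldsymbol{P}\widetilde{\boldsymbol{Z}}_{(5,6)}$, so with the properties of the standard normal distribution $\boldsymbol{E}_{k}\sim\mathcal{N}_{ad}(\boldsymbol{0},\boldsymbol{I}_{ad})$, where the $\boldsymbol{E}_{k}$ are independent for different indices. Thus, we can rewrite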
\begin{array}[]{l}{\boldsymbol{Z}_{(1,2)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(3,4)}={\widetilde{\boldsymbol{Z}}_{(1,2)}}^{\top}2{\boldsymbol{V}_{N}^{1/2}}^{\top}\boldsymbol{T}\boldsymbol{V}_{N}^{1/2}\widetilde{\boldsymbol{Z}}_{(3,4)}=2{\widetilde{\boldsymbol{Z}}_{(1,2)}}^{\top}\boldsymbol{P}^{\top}\boldsymbol{D}\boldsymbol{P}\widetilde{\boldsymbol{Z}}_{(3,4)}=2\boldsymbol{E}_{1}^{\top}\boldsymbol{D}\boldsymbol{E}_{3}.\end{array}
With this argument for all three random variables it follows for the second moment that
\begin{array}[]{ll}&{\mathbb{E}}\left(\left[\boldsymbol{E}_{1}^{\top}\boldsymbol{D}\boldsymbol{E}_{3}\boldsymbol{E}_{3}^{\top}\boldsymbol{D}\boldsymbol{E}_{5}\boldsymbol{E}_{5}^{\top}\boldsymbol{D}\boldsymbol{E}_{1}\right]^{2}\right)\\[5.16663pt] =&{\mathbb{E}}\left(\left[\sum_{i=1}^{ad}\lambda_{i}E_{1}^{(i)}E_{3}^{(i)}\right]^{2}\left[\sum_{j=1}^{ad}\lambda_{j}E_{3}^{(j)}E_{5}^{(j)}\right]^{2}\left[\sum_{h=1}^{ad}\lambda_{h}E_{5}^{(h)}E_{1}^{(h)}\right]^{2}\right)\\[6.45831pt] =&\sum\limits_{i_{1},i_{2},j_{1},j_{2},h_{1},h_{2}=1}^{ad}\lambda_{i_{1}}\lambda_{i_{2}}\lambda_{j_{1}}\lambda_{j_{2}}\lambda_{h_{1}}\lambda_{h_{2}}{\mathbb{E}}\left(E_{1}^{(i_{1})}E_{3}^{(i_{1})}E_{1}^{(i_{2})}E_{3}^{(i_{2})}E_{3}^{(j_{1})}E_{5}^{(j_{1})}E_{3}^{(j_{2})}E_{5}^{(j_{2})}E_{5}^{(h_{1})}E_{1}^{(h_{1})}E_{5}^{(h_{2})}E_{1}^{(h_{2})}\right).\end{array}
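Now we consider the expectation for the different index combinations. If all indices are equal, it is given by
\begin{array}[]{l}{\mathbb{E}}\left(\left(E_{1}^{(i)}\right)^{4}\left(E_{3}^{(i)}\right)^{4}\left(E_{5}^{(i)}\right)^{4}\right)={\mathbb{E}}\left(\left(E_{1}^{(i)}\right)^{4}\right){\mathbb{E}}\left(\left(E_{3}^{(i)}\right)^{4}\right){\mathbb{E}}\left(\left(E_{5}^{(i)}\right)^{4}\right)=3^{3}=27.\end{array}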
Moreover, for and it holds that
[TABLE]
Next, the case is considered (noting this result can also be used for both analogue combinations):
[TABLE]
Finally, we consider the combination and obtain
[TABLE]
*This is also true for and the analogous combinations, so, all in all, we have 4 combinations of this kind. All other index combinations lead to expectation zero, because in these combinations at least one index appears only once in the product. Therefore, by independence and the fact that all random variables are centered, it holds that
\begin{array}[]{ll}&{\mathbb{E}}\left(\left[\boldsymbol{E}_{1}^{\top}\boldsymbol{D}\boldsymbol{E}_{3}\boldsymbol{E}_{3}^{\top}\boldsymbol{D}\boldsymbol{E}_{5}\boldsymbol{E}_{5}^{\top}\boldsymbol{D}\boldsymbol{E}_{1}\right]^{2}\right)\\[8.61108pt] =&\sum\limits_{i=1}^{ad}\lambda_{i}^{6}\cdot 27+\sum\limits_{\footnotesize\begin{subarray}{c}i,j=1\\ i\neq j\end{subarray}}^{ad}\lambda_{i}^{3}\lambda_{j}^{3}\cdot 1\cdot 4+\sum\limits_{\footnotesize\begin{subarray}{c}i,j=1\\ i\neq j\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{4}\cdot 9+\sum\limits_{\footnotesize\begin{subarray}{c}i,j,h=1\\ i\neq j\neq h\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{2}\lambda_{h}^{2}\\[12.91663pt] =&23\sum\limits_{i=1}^{ad}\lambda_{i}^{6}+4\left(\sum\limits_{\footnotesize\begin{subarray}{c}i,j=1\\ i\neq j\end{subarray}}^{ad}\lambda_{i}^{3}\lambda_{j}^{3}+\sum\limits_{i=j=1}^{ad}\lambda_{i}^{3}\lambda_{j}^{3}\right)+9\sum\limits_{\footnotesize\begin{subarray}{c}i,j=1\\ i\neq j\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{4}+\sum\limits_{\footnotesize\begin{subarray}{c}i,j,h=1\\ i\neq j\neq h\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{2}\lambda_{h}^{2}\end{array}
\begin{array}[]{ll}=&17\sum\limits_{i=1}^{ad}\lambda_{i}^{6}+4\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)+3\sum\limits_{\footnotesize\begin{subarray}{c}i,j=1\\ i\neq j\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{4}+6\left(\sum\limits_{\footnotesize\begin{subarray}{c}i,j=1\\ i\neq j\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{4}+\sum\limits_{i=j=1}^{ad}\lambda_{i}^{2}\lambda_{j}^{4}\right)+\sum\limits_{\footnotesize\begin{subarray}{c}i,j,h=1\\ i\neq j\neq h\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{2}\lambda_{h}^{2}\\[12.91663pt] =&17\sum\limits_{i=1}^{ad}\lambda_{i}^{6}+4\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)+3\sum\limits_{\footnotesize\begin{subarray}{c}i,j=1\\ i\neq j\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{4}+6\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{4}\right)\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)+\sum\limits_{\footnotesize\begin{subarray}{c}i,j,h=1\\ i\neq j\neq h\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{2}\lambda_{h}^{2}\\[12.91663pt] \stackrel{{\scriptstyle\ref{Spur1}}}{{\leq}}&17\sum\limits_{i=1}^{ad}\lambda_{i}^{6}+4\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)+3\sum\limits_{\footnotesize\begin{subarray}{c}i,j=1\\ i\neq j\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{4}+6\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)+\sum\limits_{\footnotesize\begin{subarray}{c}i,j,h=1\\ i\neq j\neq h\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{2}\lambda_{h}^{2}\end{array}
\begin{array}[]{ll}\stackrel{{\scriptstyle\ref{Spur1}}}{{\leq}}&20\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)+6\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)+\left(\sum\limits_{\footnotesize\begin{subarray}{c}i,j,h=1\\ i\neq j\neq h\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{2}\lambda_{h}^{2}+3\sum\limits_{\footnotesize\begin{subarray}{c}i,j=1\\ i\neq j\end{subarray}}^{ad}\lambda_{i}^{2}\lambda_{j}^{4}+\sum\limits_{i=1}^{ad}\lambda_{i}^{6}\right)\\[10.76385pt] =&20\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)+7\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)\\ \stackrel{{\scriptstyle\ref{Spur1}}}{{\leq}}&20\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{4}\right)\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)+7\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)\\ \stackrel{{\scriptstyle\ref{Spur1}}}{{\leq}}&27\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right).\end{array}
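The index bookkeeping in this chain can be checked mechanically. The following Python sketch is an illustration and not part of the original proof: it enumerates all index tuples for a small number of eigenvalues, evaluates each expectation exactly from the moments of the standard normal distribution, and compares the result with the four-sum expression and with the final bound $27\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)=27\left(\sum_{i}\lambda_{i}^{2}\right)^{3}$. The function names and the example eigenvalues are hypothetical.

```python
from collections import Counter
from itertools import product


def gaussian_moment(k):
    """E[Z^k] for Z ~ N(0, 1): zero for odd k, (k - 1)!! for even k."""
    if k % 2:
        return 0
    result = 1
    for odd in range(k - 1, 0, -2):
        result *= odd
    return result


def second_moment_bruteforce(lams):
    """Exact E([E1' D E3 * E3' D E5 * E5' D E1]^2) by summing over all indices."""
    m = len(lams)
    total = 0
    for i1, i2, j1, j2, h1, h2 in product(range(m), repeat=6):
        # Coordinates in which E1, E3 and E5 occur for this index tuple.
        occurrences = (
            Counter([i1, i2, h1, h2]),  # E1
            Counter([i1, i2, j1, j2]),  # E3
            Counter([j1, j2, h1, h2]),  # E5
        )
        moment = 1
        for counts in occurrences:
            for power in counts.values():
                moment *= gaussian_moment(power)
        weight = lams[i1] * lams[i2] * lams[j1] * lams[j2] * lams[h1] * lams[h2]
        total += weight * moment
    return total


def second_moment_closed_form(lams):
    """The four-sum expression with coefficients 27, 4, 9 and 1."""
    idx = range(len(lams))
    s6 = sum(lam ** 6 for lam in lams)
    s33 = sum(lams[i] ** 3 * lams[j] ** 3 for i in idx for j in idx if i != j)
    s24 = sum(lams[i] ** 2 * lams[j] ** 4 for i in idx for j in idx if i != j)
    s222 = sum(
        lams[i] ** 2 * lams[j] ** 2 * lams[h] ** 2
        for i in idx for j in idx for h in idx
        if len({i, j, h}) == 3
    )
    return 27 * s6 + 4 * s33 + 9 * s24 + s222


if __name__ == "__main__":
    eigenvalues = [1, 2, 3]  # hypothetical non-negative eigenvalues
    exact = second_moment_bruteforce(eigenvalues)
    bound = 27 * sum(lam ** 2 for lam in eigenvalues) ** 3
    print(exact == second_moment_closed_form(eigenvalues), exact <= bound)
```

Integer eigenvalues keep the comparison exact; the enumeration grows like $m^{6}$ in the number $m$ of eigenvalues, so it is only meant for very small examples.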
*So we can control the variance by
\begin{array}[]{ll}\operatorname{{\it Var}}(C_{5})&\stackrel{{\scriptstyle\ref{Var1}}}{{\leq}}\frac{\operatorname{{\it Var}}\left(\Lambda_{1}(1,2,3,4,5,6,\dots,5,6)\cdot\Lambda_{2}(1,2,3,4,5,6,\dots,5,6)\cdot\Lambda_{3}(1,2,3,4,5,6,\dots,5,6)\right)}{{64\cdot\prod\limits_{i=1}^{a}\binom{n_{i}}{6}}\cdot{\left(\prod\limits_{i=1}^{a}\binom{n_{i}}{6}-\prod\limits_{i=1}^{a}\binom{n_{i}-6}{6}\right)^{-1}}}\\[12.91663pt] &{\leq}\frac{{\mathbb{E}}\left(\left[\Lambda_{1}(1,2,3,4,5,6,\dots,5,6)\cdot\Lambda_{2}(1,2,3,4,5,6,\dots,5,6)\cdot\Lambda_{3}(1,2,3,4,5,6,\dots,5,6)\right]^{2}\right)}{{64\cdot\prod\limits_{i=1}^{a}\binom{n_{i}}{6}}\cdot{\left(\prod\limits_{i=1}^{a}\binom{n_{i}}{6}-\prod\limits_{i=1}^{a}\binom{n_{i}-6}{6}\right)^{-1}}}\end{array}
\begin{array}[]{ll}\phantom{\operatorname{{\it Var}}(C_{5})}&=\frac{{\mathbb{E}}\left(\left[2^{3}\cdot\boldsymbol{E}_{1}^{\top}\boldsymbol{D}\boldsymbol{E}_{3}\boldsymbol{E}_{3}^{\top}\boldsymbol{D}\boldsymbol{E}_{5}\boldsymbol{E}_{5}^{\top}\boldsymbol{D}\boldsymbol{E}_{1}\right]^{2}\right)}{{64\cdot\prod\limits_{i=1}^{a}\binom{n_{i}}{6}}\cdot{\left(\prod\limits_{i=1}^{a}\binom{n_{i}}{6}-\prod\limits_{i=1}^{a}\binom{n_{i}-6}{6}\right)^{-1}}}\\[15.0694pt] &\leq\frac{\left(\prod\limits_{i=1}^{a}\binom{n_{i}}{6}-\prod\limits_{i=1}^{a}\binom{n_{i}-6}{6}\right)}{\prod\limits_{i=1}^{a}\binom{n_{i}}{6}}\cdot 27\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right).\end{array}
∎*
With this result, we can construct an estimator for $\tau_{P}$ step by step:
Lemma A.14:
For as previously defined, it holds for fixed that
[TABLE]
It even holds in the asymptotic frameworks (4)-(5) if exists with .
- Proof:
*From the previous lemma, we know that
\begin{array}[]{ll}{\mathbb{E}}\left(\frac{C_{5}}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)&={\mathbb{E}}\left(\frac{C_{5}}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)-\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}=0,\end{array}
\begin{array}[]{ll}\operatorname{{\it Var}}\left(\frac{C_{5}}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)&=\frac{\operatorname{{\it Var}}(C_{5})}{\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\stackrel{{\scriptstyle\ref{MSchae2}}}{{\leq}}27\cdot\frac{\left(\prod\limits_{i=1}^{a}{n_{i}\choose 6}-\prod\limits_{i=1}^{a}\binom{n_{i}-6}{6}\right)}{\prod\limits_{i=1}^{a}\binom{n_{i}}{6}}.\end{array}
*For fixed this is a zero sequence. If we consider we need the existence of and to guarantee that the upper bound is a zero sequence.
So in both cases A.6 can be used. ∎*
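The combinatorial factor $\left(\prod_{i}\binom{n_{i}}{6}-\prod_{i}\binom{n_{i}-6}{6}\right)/\prod_{i}\binom{n_{i}}{6}$ in the variance bound can be evaluated exactly for given group sizes. The following Python sketch, with a hypothetical function name and illustrative group sizes, shows that it decreases as the group sizes grow for a fixed number of groups, but stays close to 1 when the number of groups is large relative to the group sizes.

```python
from fractions import Fraction
from math import comb, prod


def c5_variance_factor(group_sizes):
    """Exact value of (prod C(n_i, 6) - prod C(n_i - 6, 6)) / prod C(n_i, 6).

    group_sizes -- list of the group sizes n_1, ..., n_a (each at least 6)
    """
    full = prod(comb(n, 6) for n in group_sizes)
    disjoint = prod(comb(n - 6, 6) for n in group_sizes)
    return Fraction(full - disjoint, full)


if __name__ == "__main__":
    for sizes in ([100, 100], [1000, 1000], [100] * 50):
        print(len(sizes), sizes[0], float(c5_variance_factor(sizes)))
```

Each ratio $\binom{n_{i}-6}{6}/\binom{n_{i}}{6}$ is a product of six terms of the form $1-6/(n_{i}-k)$, which is why the factor only vanishes if the number of groups grows slowly enough compared with $n_{\min}$.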
Lemma A.15:
Moreover, it holds for fixed that
[TABLE]
If exists with , the convergence even holds in the asymptotic frameworks (4)-(5).
- Proof:
*With the last lemma it follows for both cases that
\begin{array}[]{ll}\frac{C_{5}^{2}}{\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\tau_{P}&=\left(\frac{C_{5}}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)^{2}-\left(\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)^{2}\\[8.61108pt] &\vspace{.25cm}=\left[\frac{C_{5}}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right]\left[\frac{C_{5}}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}+\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right]\end{array}
\begin{array}[]{ll}\vspace{.25cm}{\color[rgb]{1,1,1}\frac{C_{5}^{2}}{\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\tau_{P}}&=o_{P}(1)\cdot\left[\frac{C_{5}}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}+2\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right]\\[5.16663pt] &=o_{P}(1)\cdot\left[o_{P}(1)+2\cdot\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right]=o_{P}(1).\end{array}
For the last step we used that which is known from A.8 (p.A.8) and hence . As the product of a bounded term and a term that converges to zero in probability, it also converges to zero in probability, and with Slutsky's lemma the result follows. ∎*
- Proof of 4.1 :
From 3.1 (p.3.1) together with A.6 (p.A.6) it follows
[TABLE]
*independent of or . With A.15 (p.A.15) it follows
[TABLE]
*or under the additional condition also in the asymptotic frameworks (4) -(5) .
*With these limits in both cases we can calculate
\begin{array}[]{ll}\frac{C_{5}^{2}}{A_{4}^{3}}-\tau_{P}&=\frac{C_{5}^{2}}{\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\cdot\frac{\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}{A_{4}^{3}}-\tau_{P}\\[5.16663pt] &=\frac{C_{5}^{2}}{\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\cdot(1+o_{P}(1))-\tau_{P}\\[5.16663pt] &=\frac{C_{5}^{2}}{\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\tau_{P}+\left(\frac{C_{5}^{2}}{\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\tau_{P}+\tau_{P}\right)\cdot o_{P}(1)\\[7.74998pt] &=o_{P}(1)+o_{P}(1)\cdot o_{P}(1)+\tau_{P}\cdot o_{P}(1)=o_{P}(1).\end{array}
As in the previous lemma we used and Slutsky's lemma. ∎*
For the properties are shown in a similar way as in A.12 (p.A.12).
Lemma A.16:
For
[TABLE]
[TABLE]
[TABLE]
define
[TABLE]
Then it holds
[TABLE]
- Proof:
*With the same steps as in the previous lemma and by using the fact that expectation and variance do not depend on the concrete indices but rather on the structure of independences we get
\begin{array}[]{ll}{\mathbb{E}}\left({C_{5}^{\star}}(B)\right)&=\frac{1}{8B}\sum\limits_{b=1}^{B}{\mathbb{E}}\left(\Lambda_{1}(\boldsymbol{\sigma}(b,6))\cdot\Lambda_{2}(\boldsymbol{\sigma}(b,6))\cdot\Lambda_{3}(\boldsymbol{\sigma}(b,6))\right)\\[7.74998pt] &=\frac{1}{8B}\sum\limits_{b=1}^{B}{\mathbb{E}}\left(\Lambda_{1}(\ell_{1,1},\dots,\ell_{6,a})\cdot\Lambda_{2}(\ell_{1,1},\dots,\ell_{6,a})\cdot\Lambda_{3}(\ell_{1,1},\dots,\ell_{6,a})\right)\\[7.74998pt] &\stackrel{{\scriptstyle\ref{MSchae2}}}{{=}}\frac{1}{8B}\sum\limits_{b=1}^{B}\operatorname{tr}\left(\left(2\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)=\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right).\end{array}
\begin{array}[]{l}\operatorname{{\it Var}}\left({\mathbb{E}}\left({C_{5}^{\star}}(B)|\mathcal{F}(\boldsymbol{\sigma}(B,6))\right)\right)=\operatorname{{\it Var}}\left(\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)\right)=0.\end{array}
\begin{array}[]{ll}\operatorname{{\it Var}}\left({C_{5}^{\star}}(B)\right)&=0+{\mathbb{E}}\left(\operatorname{{\it Var}}\left({C_{5}^{\star}}(B)|\mathcal{F}(\boldsymbol{\sigma}(B,6))\right)\right)\\[4.30554pt] &\stackrel{{\scriptstyle\ref{Var1}}}{{\leq}}\frac{1}{64B^{2}}{\mathbb{E}}\left(\sum\limits_{(j,\ell)\in{\mathbb{N}}_{B}\times{\mathbb{N}}_{B}\setminus M(B,\boldsymbol{\sigma}(b,6))}\operatorname{{\it Var}}\left(\Lambda_{1}(\boldsymbol{\sigma}(j,6))\Lambda_{2}(\boldsymbol{\sigma}(j,6))\Lambda_{3}(\boldsymbol{\sigma}(j,6))|\mathcal{F}(\boldsymbol{\sigma}(B,6))\right)\right)\\[10.33327pt] &=\frac{{\mathbb{E}}\left(|{\mathbb{N}}_{B}\times{\mathbb{N}}_{B}\setminus M(B,\boldsymbol{\sigma}(b,6))|\right)}{B^{2}}\cdot\frac{\operatorname{{\it Var}}\left({\boldsymbol{Z}_{(1,2)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(3,4)}\cdot{\boldsymbol{Z}_{(3,4)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(5,6)}\cdot{\boldsymbol{Z}_{(5,6)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(1,2)}\right)}{64}\\[4.30554pt] &\stackrel{{\scriptstyle\ref{MSchae2}}}{{\leq}}\left(1-\left(1-\frac{1}{B}\right)\cdot\prod\limits_{i=1}^{a}\frac{\binom{n_{i}-6}{6}}{\binom{n_{i}}{6}}\right)\cdot 27\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right).\end{array}
∎
- Proof of Theorem 4.2 (p.4.2):
With A.16 we recognize and . Therefore and . This is the only condition needed for the proof of [31, Theorem 3.1], so the result follows. ∎
Although with is not too critical in most settings we additionally developed an estimator which can be used without any restrictions.
For this estimator another random vector has to be introduced: The random vector represents a random permutation of the numbers where are independent for different or and denotes its -th element. Then we define
[TABLE]
with
[TABLE]
and
[TABLE]
This estimator again uses Z but, in contrast to , the indices are the same for all groups. However, the highest index is , so some index combinations are unachievable. For this reason the random permutations above are used: first the observations in each group are rearranged randomly, and the sum of the relevant terms is calculated from these rearranged samples. The observations are then rearranged again and the same terms are recalculated. Summing these values and dividing by the number of rearrangements gives an alternative to , as shown in the following lemma.
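The rearrangement procedure described above can be sketched in Python. This is only an illustration under stated assumptions: the helper names are hypothetical, Z for an index pair is assumed to be the stacked vector of within-group differences scaled by $\sqrt{N/n_i}$, and the three $\Lambda$ terms are assumed to form the same cyclic product that appears in the variance bound for the subsampling version of $C_5$. The normalisation $8\cdot n_{\min}!/(n_{\min}-6)!$ is the one used in the proof of Lemma A.17.

```python
import itertools
import math
import random

def bilinear(u, T, v):
    """Return u^T T v for plain-list vectors u, v and a list-of-lists matrix T."""
    return sum(u[r] * T[r][c] * v[c] for r in range(len(u)) for c in range(len(v)))

def stacked_difference(groups, N, l1, l2):
    """ASSUMED form of Z: stack sqrt(N/n_i) * (X_{i,l1} - X_{i,l2}) over all groups i."""
    z = []
    for obs in groups:
        scale = math.sqrt(N / len(obs))
        z.extend(scale * (x - y) for x, y in zip(obs[l1], obs[l2]))
    return z

def c7_sketch(groups, T, w, rng):
    """Average over w random rearrangements of the normalised sum, over all ordered
    6-tuples of distinct indices below n_min, of the assumed cyclic product."""
    N = sum(len(obs) for obs in groups)
    n_min = min(len(obs) for obs in groups)
    n_tuples = math.perm(n_min, 6)  # n_min! / (n_min - 6)!
    total = 0.0
    for _ in range(w):
        # Rearrange the observations independently within each group.
        shuffled = [rng.sample(obs, len(obs)) for obs in groups]
        for l in itertools.permutations(range(n_min), 6):
            z12 = stacked_difference(shuffled, N, l[0], l[1])
            z34 = stacked_difference(shuffled, N, l[2], l[3])
            z56 = stacked_difference(shuffled, N, l[4], l[5])
            total += (bilinear(z12, T, z34) * bilinear(z34, T, z56)
                      * bilinear(z56, T, z12))
    return total / (8 * n_tuples * w)
```

Here `groups` is a list with one entry per group, each a list of observation vectors, and `T` is a square matrix whose dimension equals the number of groups times the dimension of one observation. The inner loop visits $n_{\min}!/(n_{\min}-6)!$ index tuples per rearrangement, so the sketch is only practical for very small samples.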
Lemma A.17:
*For as defined before it holds *
[TABLE]
- Proof:
*Again we calculate
\begin{array}[]{ll}{\mathbb{E}}\left(C_{7}\left(w\right)\right)&=\frac{1}{w}\sum\limits_{j=1}^{w}\sum\limits_{\ell_{1}\neq\dots\neq\ell_{6}=1}^{n_{\min}}\frac{{\mathbb{E}}\left(\Lambda_{4}\left(j;\ell_{1},\dots,\ell_{6}\right)\cdot\Lambda_{5}\left(j;\ell_{1},\dots,\ell_{6}\right)\cdot\Lambda_{6}\left(j;\ell_{1},\dots,\ell_{6}\right)\right)}{8\cdot\frac{n_{\min}!}{\left(n_{\min}-6\right)!}}\\[10.76385pt] &=\frac{1}{w}\sum\limits_{j=1}^{w}\sum\limits_{\ell_{1}\neq\dots\neq\ell_{6}=1}^{n_{\min}}\frac{{\mathbb{E}}\left(\Lambda_{4}\left(j;1,\dots,6\right)\cdot\Lambda_{5}\left(j;1,\dots,6\right)\cdot\Lambda_{6}\left(j;1,\dots,6\right)\right)}{8\cdot\frac{n_{\min}!}{\left(n_{\min}-6\right)!}}=\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right).\end{array}
Because all groups use the same indices, the number of remaining index combinations simplifies and we obtain*
[TABLE]
*For the sum this leads to
\begin{array}[]{ll}\operatorname{{\it Var}}\left(C_{7}\left(w\right)\right)&=\operatorname{{\it Var}}\left(\frac{1}{w}\sum\limits_{j=1}^{w}\sum\limits_{\ell_{1}\neq\dots\neq\ell_{6}=1}^{n_{\min}}\frac{\Lambda_{4}\left(j;\ell_{1},\dots,\ell_{6}\right)\cdot\Lambda_{5}\left(j;\ell_{1},\dots,\ell_{6}\right)\cdot\Lambda_{6}\left(j;\ell_{1},\dots,\ell_{6}\right)}{8\cdot\frac{n_{\min}!}{\left(n_{\min}-6\right)!}}\right)\\[10.76385pt] &\stackrel{{\scriptstyle\ref{Var1}}}{{\leq}}\frac{1}{w^{2}}\sum\limits_{j_{1},j_{2}=1}^{w}\operatorname{{\it Var}}\left(\sum\limits_{\ell_{1}\neq\dots\neq\ell_{6}=1}^{n_{\min}}\frac{\Lambda_{4}\left(j;\ell_{1},\dots,\ell_{6}\right)\cdot\Lambda_{5}\left(j;\ell_{1},\dots,\ell_{6}\right)\cdot\Lambda_{6}\left(j;\ell_{1},\dots,\ell_{6}\right)}{8\cdot\frac{n_{\min}!}{\left(n_{\min}-6\right)!}}\right)\end{array}
\begin{array}[]{ll}{\color[rgb]{1,1,1}\operatorname{{\it Var}}\left(C_{7}\left(w\right)\right)}&\leq\frac{1}{w^{2}}\sum\limits_{j_{1},j_{2}=1}^{w}\left(\frac{\frac{n_{\min}!}{\left(n_{\min}-6\right)!}-\frac{\left(n_{\min}-6\right)!}{\left(n_{\min}-12\right)!}}{\frac{n_{\min}!}{\left(n_{\min}-6\right)!}}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)\right)\\[10.76385pt] &=\left(\frac{\frac{n_{\min}!}{\left(n_{\min}-6\right)!}-\frac{\left(n_{\min}-6\right)!}{\left(n_{\min}-12\right)!}}{\frac{n_{\min}!}{\left(n_{\min}-6\right)!}}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)\right).\end{array}
∎*
Simulations (not shown here) show that higher values for lead to better estimations.
Lemma A.18:
For as previously defined, it holds
[TABLE]
independent of a or d. Therefore this holds for the asymptotic frameworks (3)-(5).
- Proof:
*With the previous lemma we know
\begin{array}[]{ll}{\mathbb{E}}\left(\frac{C_{7}(w)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)&={\mathbb{E}}\left(\frac{C_{7}(w)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)-\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}=0,\end{array}
\begin{array}[]{ll}\operatorname{{\it Var}}\left(\frac{C_{7}(w)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}{\operatorname{tr}^{3/2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)&=\frac{\operatorname{{\it Var}}\left(C_{7}(w)\right)}{\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}{\leq}\left(\frac{\frac{n_{\min}!}{\left(n_{\min}-6\right)!}-\frac{\left(n_{\min}-6\right)!}{\left(n_{\min}-12\right)!}}{\frac{n_{\min}!}{\left(n_{\min}-6\right)!}}\right)\cdot\mathcal{O}\left(1\right).\end{array}
So exactly the same steps as in the proof of 4.1, which in this case uses that the zero sequence does not depend on a or d, lead to the result. ∎*
But for the calculation of this estimator we need summations. Thus, a subsampling-type version of is necessary which is now defined.
Lemma A.19:
*For each we independently draw random subsamples of length from and define
which holds*
[TABLE]
- Proof:
*The proof for this subsampling-type estimator follows the same steps as before, with another amount . First we calculate the expectation and an upper bound for the variance of the inner sum. We get
\begin{array}[]{ll}&{\mathbb{E}}\left(\sum\limits_{b=1}^{B}\frac{\Lambda_{4}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)\cdot\Lambda_{5}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)\cdot\Lambda_{6}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)}{8B}\right)\\[10.76385pt] =&\sum\limits_{b=1}^{B}\frac{{\mathbb{E}}\left(\Lambda_{4}\left(j;1,\dots,6\right)\cdot\Lambda_{5}\left(j;1,\dots,6\right)\cdot\Lambda_{6}\left(j;1,\dots,6\right)\right)}{8B}=\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right).\end{array}
\begin{array}[]{l}\operatorname{{\it Var}}\left({\mathbb{E}}\left(\sum\limits_{b=1}^{B}\frac{\Lambda_{4}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)\cdot\Lambda_{5}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)\cdot\Lambda_{6}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)}{8B}\Big{\lvert}\mathcal{F}\left(\boldsymbol{\sigma}_{0}(B)\right)\right)\right)=\operatorname{{\it Var}}\left(\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)\right)=0.\end{array}
\begin{array}[]{ll}&\operatorname{{\it Var}}\left(\sum\limits_{b=1}^{B}\frac{\Lambda_{4}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)\cdot\Lambda_{5}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)\cdot\Lambda_{6}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)}{8B}\right)\\[8.61108pt] =&0+{\mathbb{E}}\left(\operatorname{{\it Var}}\left(\sum\limits_{b=1}^{B}\frac{\Lambda_{4}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)\cdot\Lambda_{5}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)\cdot\Lambda_{6}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)}{8B}\Big{\lvert}\mathcal{F}\left(\boldsymbol{\sigma}_{0}(B)\right)\right)\right)\end{array}
\begin{array}[]{ll}=&\frac{{\mathbb{E}}\left(|{\mathbb{N}}_{B}\times{\mathbb{N}}_{B}\setminus M\left(B,\boldsymbol{\sigma}_{0}(b,6)\right)|\right)}{B^{2}}\cdot\frac{\operatorname{{\it Var}}\left(\Lambda_{4}\left(j;{1},\dots,{6}\right)\cdot\Lambda_{5}\left(j;{1},\dots,{6}\right)\cdot\Lambda_{6}\left(j;{1},\dots,{6}\right)\right)}{64}\\[4.30554pt] \stackrel{{\scriptstyle\ref{MSchae2}}}{{\leq}}&\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-6}{6}}{\binom{n_{\min}}{6}}\right)\cdot 27\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right).\end{array}
*With these values we can consider the whole estimator
\begin{array}[]{ll}{\mathbb{E}}\left({C_{7}^{\star}}\left(w,B\right)\right)&=\frac{1}{w}\sum\limits_{j=1}^{w}{\mathbb{E}}\left(\sum\limits_{b=1}^{B}\frac{\Lambda_{4}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)\cdot\Lambda_{5}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)\cdot\Lambda_{6}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)}{8\cdot B}\right)=\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right),\end{array}
\begin{array}[]{ll}\operatorname{{\it Var}}\left({C_{7}^{\star}}(w,B)\right)&\leq\frac{1}{w^{2}}\left(\sum\limits_{j=1}^{w}\sqrt{\operatorname{{\it Var}}\left(\sum\limits_{b=1}^{B}\frac{\Lambda_{4}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)\cdot\Lambda_{5}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)\cdot\Lambda_{6}\left(j;\boldsymbol{\sigma}_{0}(b,6)\right)}{8B}\right)}\right)^{2}\\[10.76385pt] &\leq\frac{1}{w^{2}}\left(\sum\limits_{j=1}^{w}\sqrt{\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-6}{6}}{\binom{n_{\min}}{6}}\right)\cdot 27\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)^{2}\\[10.76385pt] &=\left(1-\left(1-\frac{1}{B}\right)\cdot\frac{\binom{n_{\min}-6}{6}}{\binom{n_{\min}}{6}}\right)\cdot 27\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right).\end{array}
∎
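To see how the number of subsamples $B$ and the smallest group size $n_{\min}$ interact in the bound just derived, the leading factor $1-\left(1-\frac{1}{B}\right)\binom{n_{\min}-6}{6}/\binom{n_{\min}}{6}$ can be tabulated. The Python sketch below does this; the function name `subsampling_factor` and the chosen values are illustrative assumptions, and $n_{\min}\geq 6$ is assumed.

```python
from math import comb

def subsampling_factor(n_min, B, k=6):
    """Factor 1 - (1 - 1/B) * C(n_min - k, k) / C(n_min, k) from the variance bound.

    Assumes n_min >= k and B >= 1.
    """
    ratio = comb(n_min - k, k) / comb(n_min, k)
    return 1.0 - (1.0 - 1.0 / B) * ratio

# Increasing B alone lowers the factor only down to 1 - ratio;
# it approaches zero only if n_min grows as well.
for n_min in (20, 100, 1000):
    for B in (10, 1000, 100000):
        print(n_min, B, round(subsampling_factor(n_min, B), 4))
```

With a single subsample ($B=1$) the factor equals one, so the bound gives no reduction relative to the variance of a single term.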
The next lemma shows that the version of the estimators with random indices has all the properties the classical ones possess.
Lemma A.20:
*The statements of A.11, A.14, A.15, 4.1 and A.18 are also true, if all or only a part of the estimators are replaced by the subsampling-type estimators.
Moreover, Theorem 3.1 , Theorem 3.2 and Theorem 4.2 hold, if all or only a part of the estimators are replaced by the subsampling-type estimators.*
- Proof:
*For the proofs of the classical estimators from the first paragraph, only the expectation values are used together with upper bounds for the variances which are zero sequences. With random indices, the expectation is the same and for the variance, all traces are the same but the zero sequence changes. So the proofs of the subsampling-type estimators work identically.
For the second paragraph, only some convergences are necessary, which the subsampling-type estimators also fulfill. ∎*
A.4 On the asymptotic distribution in our simulation designs
To choose the appropriate test for our simulation, the limit of has to be considered. Instead, we calculate the value of \tau_{P}={\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{3}\right)}\Big{/}{\operatorname{tr}^{3}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)} and, because is known, no estimation is needed. The ratio and are the same for all our sample sizes, so the different numbers have no influence on the values of . The results can be seen in Table 3 and Table 4, which lead to the assumption for and for . With A.8 (p.A.8) this is equivalent to under resp. under .
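When the covariance structure is known, $\tau_{P}$ can be evaluated directly from its definition. The following Python sketch does this with plain nested lists; the helper names and the two example matrices are illustrative assumptions and are not the settings of the paper's simulation study.

```python
def matmul(A, B):
    """Product of two square list-of-lists matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_power(M, k):
    """Return tr(M^k) for an integer k >= 1."""
    P = M
    for _ in range(k - 1):
        P = matmul(P, M)
    return sum(P[i][i] for i in range(len(M)))

def tau_p(T, V):
    """tau_P = tr^2((T V)^3) / tr^3((T V)^2)."""
    M = matmul(T, V)
    return trace_power(M, 3) ** 2 / trace_power(M, 2) ** 3

d = 4
identity = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
ones = [[1.0, 1.0], [1.0, 1.0]]
eye2 = [[1.0, 0.0], [0.0, 1.0]]

print(tau_p(identity, identity))  # T V = I_4 gives 1/d
print(tau_p(eye2, ones))          # rank-one T V gives the maximal value
```

For $\boldsymbol{T}\boldsymbol{V}_{N}=\boldsymbol{I}_{d}$ the value is $1/d$, which tends to zero as $d$ grows, while a rank-one product yields the value one.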
A.5 On the Chen-Qin-Condition
We can also develop an estimator for in a way analogous to before. This leads to:
Lemma A.21:
Let be
[TABLE]
*with
\begin{array}[]{ll}\Lambda_{7}(\ell_{1,1},\dots,\ell_{8,a})&=\left[\boldsymbol{Z}_{(\ell_{1,1},\ell_{2,1},\dots,\ell_{1,a},\ell_{2,a})}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(\ell_{3,1},\ell_{4,1},\dots,\ell_{3,a},\ell_{4,a})}\right]^{4},\\ \Lambda_{8}(\ell_{1,1},\dots,\ell_{8,a})&=\left[\boldsymbol{Z}_{(\ell_{1,1},\ell_{2,1},\dots,\ell_{1,a},\ell_{2,a})}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(\ell_{3,1},\ell_{4,1},\dots,\ell_{3,a},\ell_{4,a})}\right]^{2}\cdot\left[\boldsymbol{Z}_{(\ell_{5,1},\ell_{6,1},\dots,\ell_{5,a},\ell_{6,a})}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(\ell_{7,1},\ell_{8,1},\dots,\ell_{7,a},\ell_{8,a})}\right]^{2}.\end{array}
Then we know*
[TABLE]
- Proof:
\begin{array}[]{ll}{\mathbb{E}}(C_{6})&=\frac{{\mathbb{E}}\left(\left[\boldsymbol{Z}_{(1,2)}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(3,4)}\right]^{4}\right)}{6\cdot 16}-\frac{{\mathbb{E}}\left(\left[\boldsymbol{Z}_{(1,2)}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(3,4)}\right]^{2}\left[\boldsymbol{Z}_{(5,6)}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(7,8)}\right]^{2}\right)}{2\cdot 16}\\[4.30554pt] &\stackrel{{\scriptstyle\ref{QF3}}}{{=}}\frac{1}{6\cdot 16}\left(6\operatorname{tr}\left(\left(2\boldsymbol{T}\boldsymbol{V}_{N}\right)^{4}\right)+3\operatorname{tr}^{2}\left(\left(2\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)\right)-\frac{1}{2\cdot 16}\operatorname{tr}^{2}\left(\left(2\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)=\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{4}\right)\end{array}
*For the second inequality, the variance of parts is calculated. Like before with A.2 (p.A.2) and A.4 (p.A.4) we calculate
*and
\begin{array}[]{ll}&\operatorname{{\it Var}}\left(\frac{1}{2}\left[{\boldsymbol{Z}_{(1,2)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(3,4)}\right]^{2}\left[{\boldsymbol{Z}_{(5,6)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(7,8)}\right]^{2}\right)\\[6.45831pt] \leq&\frac{1}{4}\cdot{\mathbb{E}}\left(\left[{\boldsymbol{Z}_{(1,2)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(3,4)}\right]^{4}\left[{\boldsymbol{Z}_{(5,6)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(7,8)}\right]^{4}\right)\\[6.45831pt] =&\frac{1}{4}\left(6\operatorname{tr}\left(\left(2\boldsymbol{T}\boldsymbol{V}_{N}\right)^{4}\right)+3\operatorname{tr}^{2}\left(\left(2\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)\right)^{2}=\mathcal{O}\left(\operatorname{tr}^{4}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)\right).\end{array}
With A.7 (p.A.7) it is known
\begin{array}[]{ll}\operatorname{{\it Var}}(A-B)&\leq\operatorname{{\it Var}}(A)+\operatorname{{\it Var}}(B)+2|\operatorname{{\it Cov}}(A,B)|\leq\left(\sqrt{\operatorname{{\it Var}}(A)}+\sqrt{\operatorname{{\it Var}}(B)}\right)^{2}\end{array}
and therefore
\begin{array}[]{ll}\operatorname{{\it Var}}(C_{6})&\leq\frac{\prod\limits_{i=1}^{a}\binom{n_{i}}{8}-\prod\limits_{i=1}^{a}\binom{n_{i}-8}{8}}{16^{2}\cdot\prod\limits_{i=1}^{a}\binom{n_{i}}{8}}\operatorname{{\it Var}}\left(\frac{1}{6}\Lambda_{7}(1,\dots,8)-\frac{1}{2}\Lambda_{8}(1,\dots,8)\right)\end{array}
\begin{array}[]{ll}{\color[rgb]{1,1,1}\operatorname{{\it Var}}(C_{6})}&\leq\frac{\prod\limits_{i=1}^{a}\binom{n_{i}}{8}-\prod\limits_{i=1}^{a}\binom{n_{i}-8}{8}}{16^{2}\cdot\prod\limits_{i=1}^{a}\binom{n_{i}}{8}}\left(\sqrt{\mathcal{O}\left(\operatorname{tr}^{4}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)\right)}+\sqrt{\mathcal{O}\left(\operatorname{tr}^{4}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)\right)}\right)^{2}\end{array}
\begin{array}[]{ll}{\color[rgb]{1,1,1}\operatorname{{\it Var}}(C_{6})}&=\frac{\prod\limits_{i=1}^{a}\binom{n_{i}}{8}-\prod\limits_{i=1}^{a}\binom{n_{i}-8}{8}}{16^{2}\cdot\prod\limits_{i=1}^{a}\binom{n_{i}}{8}}{\mathcal{O}\left(\operatorname{tr}^{4}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)\right)}.\end{array}
∎
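The cancellation of constants in the expectation step of the proof of Lemma A.21 can be checked with exact arithmetic. In the Python sketch below, `M` stands for $\boldsymbol{T}\boldsymbol{V}_{N}$ and the moment expressions are copied from the display for ${\mathbb{E}}(C_{6})$; the function names and the test matrix are illustrative assumptions.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square list-of-lists matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_power(M, k):
    """Return tr(M^k) for an integer k >= 1."""
    P = M
    for _ in range(k - 1):
        P = matmul(P, M)
    return sum(P[i][i] for i in range(len(M)))

def expected_c6(M):
    """(6 tr((2M)^4) + 3 tr^2((2M)^2)) / (6*16) - tr^2((2M)^2) / (2*16)."""
    two_m = [[2 * x for x in row] for row in M]
    t4 = trace_power(two_m, 4)
    t2_squared = trace_power(two_m, 2) ** 2
    return Fraction(6 * t4 + 3 * t2_squared, 6 * 16) - Fraction(t2_squared, 2 * 16)

M = [[2, 1], [1, 3]]  # integer entries keep the arithmetic exact
print(expected_c6(M), trace_power(M, 4))
```

The squared-trace terms contribute $3\cdot 16/96 = 1/2$ and $16/32 = 1/2$ times $\operatorname{tr}^{2}(M^{2})$ with opposite signs, so only $\operatorname{tr}(M^{4})$ remains.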
Lemma A.22:
With the estimators introduced in the previous lemmata it holds for fixed
[TABLE]
If exists with , the convergence even holds in the asymptotic frameworks (4)-(5).
- Proof:
*Again we first consider the parts:
\begin{array}[]{ll}{\mathbb{E}}\left(\frac{C_{6}}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{4}\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)=\frac{{\mathbb{E}}\left(C_{6}\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{4}\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}=0.\end{array}
\begin{array}[]{ll}\operatorname{{\it Var}}\left(\frac{C_{6}}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}-\frac{\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{4}\right)}{\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\right)&\leq\frac{\prod\limits_{i=1}^{a}\binom{n_{i}}{8}-\prod\limits_{i=1}^{a}\binom{n_{i}-8}{8}}{16^{2}\cdot\prod\limits_{i=1}^{a}\binom{n_{i}}{8}}\frac{\mathcal{O}\left(\operatorname{tr}^{4}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)\right)}{\operatorname{tr}^{4}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\leq\frac{\prod\limits_{i=1}^{a}\binom{n_{i}}{8}-\prod\limits_{i=1}^{a}\binom{n_{i}-8}{8}}{\prod\limits_{i=1}^{a}\binom{n_{i}}{8}}\cdot\mathcal{O}(1).\end{array}
So with A.6 (p.A.6) for fixed and and moreover if the additional condition is fulfilled even for the asymptotic frameworks (4)-(5), it follows*
[TABLE]
*Analogously to the proof of 4.1 it follows {\operatorname{tr}^{2}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)}\Big{/}{A_{4}^{2}}\stackrel{{\scriptstyle P}}{{\longrightarrow}}1.
Together this leads to
[TABLE]
∎
Again in most cases the subsampling-type version of this estimator should be used.
Lemma A.23:
Let be
[TABLE]
Then it holds
[TABLE]
- Proof:
*By using the same steps as before it holds
\begin{array}[]{ll}{\mathbb{E}}\left({C_{6}^{\star}}(B)\right)&=\frac{1}{16B}\sum\limits_{b=1}^{B}{\mathbb{E}}\left(\frac{\Lambda_{7}(\ell_{1,1},\dots,\ell_{8,a})}{6}-\frac{\Lambda_{8}(\ell_{1,1},\dots,\ell_{8,a})}{2}\right)\\[6.88889pt] &=\frac{1}{16B}\sum\limits_{b=1}^{B}{\mathbb{E}}\left(\left[{\boldsymbol{Z}_{(1,2)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(3,4)}\right]^{2}\cdot\left(\frac{\left[{\boldsymbol{Z}_{(1,2)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(3,4)}\right]^{2}}{6}-\frac{\left[{\boldsymbol{Z}_{(5,6)}}^{\top}\boldsymbol{T}\boldsymbol{Z}_{(7,8)}\right]^{2}}{2}\right)\right)\\[6.88889pt] &\stackrel{{\scriptstyle\ref{MSchae3}}}{{=}}\frac{1}{16B}\sum\limits_{b=1}^{B}\operatorname{tr}\left(\left(2\boldsymbol{T}\boldsymbol{V}_{N}\right)^{4}\right)=\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{4}\right).\end{array}
\begin{array}[]{l}\operatorname{{\it Var}}\left({\mathbb{E}}\left({C_{6}^{\star}}(B)|\mathcal{F}(\boldsymbol{\sigma}(B,8))\right)\right)=\operatorname{{\it Var}}\left(\operatorname{tr}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{4}\right)\right)=0.\end{array}
\begin{array}[]{ll}\operatorname{{\it Var}}\left({C_{6}^{\star}}(B)\right)&=0+{\mathbb{E}}\left(\operatorname{{\it Var}}\left({C_{6}^{\star}}(B)|\mathcal{F}(\boldsymbol{\sigma}(B,8))\right)\right)\\[2.15277pt] &\stackrel{{\scriptstyle\ref{Var1}}}{{\leq}}\frac{1}{16^{2}B^{2}}{\mathbb{E}}\left(\sum\limits_{(j,\ell)\in{\mathbb{N}}_{B}\times{\mathbb{N}}_{B}\setminus M(B,\boldsymbol{\sigma}(b,8))}\operatorname{{\it Var}}\left(\frac{\Lambda_{7}(\boldsymbol{\sigma}(j,8))}{6}-\frac{\Lambda_{8}(\boldsymbol{\sigma}(j,8))}{2}\Big{\lvert}\mathcal{F}(\boldsymbol{\sigma}(B,8))\right)\right)\\[3.44444pt] \end{array}
\begin{array}[]{ll}{\color[rgb]{1,1,1}\operatorname{{\it Var}}\left({C_{6}^{\star}}(B)\right)}&=\frac{\operatorname{{\it Var}}\left(\frac{\Lambda_{7}(\ell_{1,1},\dots,\ell_{8,a})}{6}-\frac{\Lambda_{8}(\ell_{1,1},\dots,\ell_{8,a})}{2}\right)}{16^{2}B^{2}\cdot\left({\mathbb{E}}\left(|{\mathbb{N}}_{B}\times{\mathbb{N}}_{B}\setminus M(B,\boldsymbol{\sigma}(b,8))|\right)\right)^{-1}}\\[6.45831pt] &\stackrel{{\scriptstyle\ref{MSchae3}}}{{\leq}}\left(1-\left(1-\frac{1}{B}\right)\cdot\prod\limits_{i=1}^{a}\frac{\binom{n_{i}-8}{8}}{\binom{n_{i}}{8}}\right)\cdot\mathcal{O}\left(\operatorname{tr}^{4}\left(\left(\boldsymbol{T}\boldsymbol{V}_{N}\right)^{2}\right)\right).\end{array}
∎
With A.19 we get an estimator for with and once more for a large number of groups should be used.
Lemma A.24:
Theorem 4.1* is also valid if is replaced by or by . Using or also does not change the result. Likewise, the result of A.22 remains true if one or all estimators are replaced by their subsampling versions.*
- Proof:
*With A.8 we know and so in both cases is asymptotically identical to .
From A.22 we know that converges in probability to zero, so this result follows identically to Theorem 4.1. Finally, the subsampling versions have the same properties as the standard estimators. ∎
Therefore this is a second way to test the hypotheses; moreover, because of A.8, it provides an indicator for the choice of the limit distribution. For situation c) from Theorem 3.1 there is no proof that this approach can be used, but in the case of just one group it leads to good results.
- [1] Ahmad, M. R., Werner, C. and Brunner, E. (2008). Analysis of High Dimensional Repeated Measures Designs: The One Sample Case. Computational Statistics and Data Analysis, 53, 416–427.
- [2] Bai, Z. and Saranadasa, H. (1996). Effect of high dimension: by an example of a two sample problem. Statistica Sinica, 6, 311–329.
- [3] Bathke, A. C. and Harrar, S. W. (2008). Nonparametric methods in multivariate factorial designs for large number of factor levels. Journal of Statistical Planning and Inference, 138, 588–610.
- [4] Bathke, A. C., Harrar, S. W. and Madden, L. V. (2008). How to compare small multivariate samples using nonparametric tests. Computational Statistics and Data Analysis, 52, 4951–4965.
- [5] Billingsley, P. (1968). Convergence of probability measures. John Wiley & Sons, New York.
- [6] Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems, I. Effect of inequality of variance in the one-way classification. The Annals of Mathematical Statistics, 25, 290–302.
- [7] Brunner, E. (2009). Repeated measures under non-sphericity. Proceedings of the 6th St. Petersburg Workshop on Simulation.
- [8] Brunner, E., Becker, B. and Werner, C. (2010). Approximate distributions of quadratic forms in high-dimensional repeated-measures designs. Technical Report, Department Medizinische Statistik, Georg-August-Universität Göttingen.
