Test for homogeneity with unordered paired observations
Jiahua Chen, Pengfei Li, Jing Qin, and Tao Yu

TL;DR
This paper develops likelihood ratio tests for homogeneity using unordered paired observations, relaxing previous assumptions and improving accuracy with Bartlett corrections, supported by simulations and real data.
Contribution
It introduces new likelihood ratio test procedures for unordered paired data that do not rely on variance or independence assumptions, with improved finite-sample accuracy.
Findings
Proposed likelihood ratio tests perform well under various scenarios.
Bartlett corrections improve test accuracy for small samples.
Methods are validated through simulations and real data examples.
| 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.809 | 0.681 | 0.634 | 0.627 | 0.596 | 0.587 | 0.585 | 0.587 | 0.568 | 0.568 | |
| 1.312 | 1.150 | 1.092 | 1.070 | 1.046 | 1.028 | 1.030 | 1.032 | 1.016 | 1.012 | |
| 0.932 | 0.801 | 0.749 | 0.721 | 0.687 | 0.674 | 0.669 | 0.651 | 0.649 | 0.645 | |
| 1.417 | 1.194 | 1.129 | 1.090 | 1.062 | 1.040 | 1.038 | 1.028 | 1.022 | 1.018 |
| Levels | 10% | 5% | 1% | 10% | 5% | 1% |
|---|---|---|---|---|---|---|
| 13.7/10.7 | 7.3/5.7 | 1.8/1.4 | 11.3/ 9.9 | 5.9/5.1 | 1.3/1.2 | |
| 12.9/10.6 | 6.9/5.2 | 1.6/1.0 | 10.8/10.2 | 5.6/5.2 | 1.2/1.0 | |
| 15.9/10.5 | 8.1/5.5 | 1.8/1.1 | 13.4/10.4 | 7.0/5.5 | 1.5/1.1 | |
| 13.5/10.1 | 7.4/5.0 | 1.8/1.1 | 11.1/10.1 | 5.9/5.2 | 1.2/1.0 | |
| 1.2/0.8 | 0.5/0.3 | 0.1/0.1 | 0.1/0.0 | 0.0/0.0 | 0.0/0.0 | |
| 3.8/3.0 | 1.9/1.4 | 0.4/0.3 | 1.8/1.7 | 0.7/0.7 | 0.1/0.1 | |
| 15.9/10.5 | 8.1/5.5 | 1.8/1.1 | 13.4/10.4 | 7.0/5.5 | 1.5/1.1 | |
| 13.5/10.1 | 7.4/5.0 | 1.8/1.1 | 11.1/10.1 | 5.9/5.2 | 1.2/1.0 | |
| 0.0/0.0 | 0.0/0.0 | 0.0/0.0 | 0.0/0.0 | 0.0/0.0 | 0.0/0.0 | |
| 0.7/0.5 | 0.3/0.2 | 0.0/0.0 | 0.1/0.1 | 0.0/0.0 | 0.0/0.0 | |
| 15.9/10.5 | 8.1/5.5 | 1.8/1.1 | 13.4/10.4 | 7.0/5.5 | 1.5/1.1 | |
| 13.5/10.1 | 7.4/5.0 | 1.8/1.1 | 11.1/10.1 | 5.9/5.2 | 1.2/1.0 | |
| 53.7/47.2 | 38.6/33.0 | 15.2/12.7 | 83.1/80.9 | 71.6/69.1 | 43.6/41.3 | |
| 39.0/34.0 | 25.5/21.2 | 8.6/6.2 | 67.6/66.3 | 53.6/52.0 | 27.3/25.6 | |
| 15.9/10.5 | 8.1/5.5 | 1.8/1.1 | 13.4/10.4 | 7.0/5.5 | 1.5/1.1 | |
| 13.5/10.1 | 7.4/5.0 | 1.8/1.1 | 11.1/10.1 | 5.9/5.2 | 1.2/1.0 | |
| 92.6/89.9 | 84.5/80.5 | 57.5/52.4 | 100.0/99.9 | 99.9/99.8 | 98.5/98.3 | |
| 80.1/76.2 | 67.1/61.3 | 37.3/30.2 | 99.7/99.6 | 99.0/98.9 | 94.5/93.9 | |
| 15.9/10.5 | 8.1/5.5 | 1.8/1.1 | 13.4/10.4 | 7.0/5.5 | 1.5/1.1 | |
| 13.5/10.1 | 7.4/5.0 | 1.8/1.1 | 11.1/10.1 | 5.9/5.2 | 1.2/1.0 | |
| 1.0 | 1.0 | 28.1 | 18.3 | 8.3 | 6.3 | 57.6 | 41.8 | 11.2 | 8.0 |
|---|---|---|---|---|---|---|---|---|---|
| 1.0 | 1.5 | 67.0 | 49.7 | 19.2 | 11.3 | 97.5 | 93.0 | 40.2 | 24.8 |
| 0.5 | 1.0 | 46.9 | 85.2 | 12.3 | 70.5 | 88.2 | 99.9 | 21.7 | 99.6 |
| 0.5 | 1.5 | 92.2 | 99.2 | 39.2 | 90.6 | 100.0 | 100.0 | 79.7 | 100.0 |
| 1.0 | 1.0 | 7.2 | 6.2 | 10.4 | 7.2 | 6.7 | 6.0 | 16.7 | 10.5 |
| 1.0 | 1.5 | 38.8 | 27.0 | 29.6 | 17.5 | 70.9 | 56.9 | 63.9 | 44.8 |
| 0.5 | 1.0 | 22.4 | 77.3 | 16.4 | 78.2 | 43.2 | 99.8 | 32.5 | 99.9 |
| 0.5 | 1.5 | 80.9 | 98.5 | 54.0 | 95.5 | 99.7 | 100.0 | 93.5 | 100.0 |
| 1.0 | 1.0 | 1.0 | 1.9 | 15.8 | 9.8 | 0.1 | 1.0 | 32.8 | 20.0 |
| 1.0 | 1.5 | 17.7 | 13.1 | 54.7 | 34.6 | 22.4 | 16.6 | 93.7 | 83.2 |
| 0.5 | 1.0 | 8.4 | 71.8 | 24.3 | 91.3 | 7.6 | 99.6 | 53.6 | 100.0 |
| 0.5 | 1.5 | 66.0 | 98.1 | 76.4 | 99.5 | 95.7 | 100.0 | 99.5 | 100.0 |
| 1.0 | 1.0 | 65.1 | 45.6 | 7.3 | 5.9 | 97.7 | 93.1 | 9.0 | 6.8 |
| 1.0 | 1.5 | 90.0 | 76.1 | 14.2 | 9.0 | 100.0 | 99.9 | 27.1 | 16.5 |
| 0.5 | 1.0 | 75.7 | 92.1 | 10.2 | 68.3 | 99.7 | 100.0 | 16.6 | 99.5 |
| 0.5 | 1.5 | 97.9 | 99.7 | 29.5 | 87.8 | 100.0 | 100.0 | 64.5 | 100.0 |
| 1.0 | 1.0 | 93.8 | 81.0 | 6.7 | 5.7 | 100.0 | 100.0 | 8.1 | 6.4 |
| 1.0 | 1.5 | 99.0 | 94.3 | 11.3 | 7.9 | 100.0 | 100.0 | 19.8 | 12.2 |
| 0.5 | 1.0 | 94.9 | 97.8 | 9.0 | 73.9 | 100.0 | 100.0 | 13.3 | 99.8 |
| 0.5 | 1.5 | 99.7 | 100.0 | 23.3 | 90.6 | 100.0 | 100.0 | 50.3 | 100.0 |
Abstract
In some applications, an experimental unit is composed of two distinct but related subunits. The response from such a unit is but we observe only and , i.e., the subunit identities are not observed. We call unordered paired observations. Based on unordered paired observations , we are interested in whether the marginal distributions for and are identical. Testing methods are available in the literature under the assumptions that and . However, by extensive simulation studies, we observe that when one or both assumptions are violated, these methods have inflated type I errors or much lower powers. In this paper, we study the likelihood ratio test statistics for various scenarios and explore their limiting distributions without these restrictive assumptions. Furthermore, we develop Bartlett correction formulae for these statistics to enhance their precision when the sample size is not large. Simulation studies and real-data examples are used to illustrate the efficacy of the proposed methods.
1 Introduction
In some applications, an experimental unit is made of two distinct but related subunits. The response from such a unit is but we observe only and ; that is, the subunit identities are not observed or unobservable. We call unordered paired observations. We assume that , for , are independent and identically distributed (i.i.d.) normal random vectors:
[TABLE]
We say that are uncorrelated when and correlated when . This paper studies the homogeneity testing of the marginal distributions of and :
[TABLE]
Unordered paired data occur in many applications, and there is a long research history. For instance, Hinkley (1973) analyzed such a data set from human genetics. The genetic blueprint of an individual is contained in 23 pairs of chromosomes. Each member of the pair is inherited from the corresponding chromosome pair of a parent. If we do not know the chromosome correspondences between the offspring and the parents, we lose the parental identities and end up with unordered paired observations. Olkin and Viana (1995) provide more examples. In visual acuity studies, we may record only a subject’s extreme acuities (the “best” and “worst” acuities) without recording the corresponding eyes. In twin experiments, we obtain unordered paired observations without a label for each member of a twin pair; see Ernst et al. (1996) and Shekar et al. (2006) and the references therein. Furthermore, unordered data of a higher dimension are collected in various scientific disciplines. For example, Davies and Phillips (1988) provided an example of unordered data of dimension . In the interim analysis of a double-blinded clinical trial of treatments, we get the order statistics without knowledge of the corresponding treatments; see also van der Meulen (2005) and Miller et al. (2009). In diffusion tensor (DT) brain imaging (see Yu et al. (2013) and the references therein), the eigenvalues of the DT estimates for each brain voxel are viewed as unordered triples.
With unordered paired observations, a fundamental question is whether or not and have the same distribution. Under Model (1), this is equivalent to testing the hypothesis specified in (2). Hinkley (1973) proposed a likelihood ratio test (LRT) procedure under the assumption that and . Li and Qin (2011) investigated this problem in a semiparametric setup. Other approaches can be found in Moore II (1973), Lauder (1977), Moore II et al. (1979), Carothers (1981), Efron et al. (1971), and Qin and Zhang (2005), among others. All these works assume that and are independent with equal variance. These assumptions may not hold in applications, and they can be severely violated, as evidenced by the examples in Section 5. Ignoring the dependence structure and/or imposing an incorrect equal-variance assumption can lead to unreliable inference conclusions: the type I error may be severely inflated or the power markedly decreased.
This paper focuses on tests for (2). In particular, we study the LRT in four scenarios: (1) and ; (2) ; (3) ; and (4) no assumption on , , and .
Investigating the asymptotic behavior of these LRT statistics is technically challenging. The well-developed theory (Wilks, 1938; Chernoff, 1954; Self and Liang, 1987; Drton, 2009) is not applicable because of the undesirable mathematical properties (see (5) in Section 2) of the log-likelihood function. In addition, an important byproduct of the theory for the corresponding LRT statistics is the asymptotic behavior of the maximum likelihood estimators (MLEs) for . Interestingly, we show that the asymptotic behavior depends on whether is known or unknown. The convergence rates of these parameter estimates depend on the scenario.
We observe that the limiting distributions of the LRT statistics under are not sufficiently accurate approximations to their finite-sample distributions when is not large. To enhance the approximation precision of the limiting distributions, we adjust the statistics based on the Bartlett correction (Bartlett, 1937; Lawley, 1956). Simulation results confirm the efficacy of the adjustment.
We organize the rest of the paper as follows. Section 2 introduces the LRT statistics for (2) and studies their asymptotic behavior under . Section 3 presents the adjusted limiting distributions of our statistics for data of limited sample size. Section 4 contains simulation studies, and Section 5 gives real-data examples. The technical details are relegated to Section 6.
2 Main Results
The LRT is an essential tool in statistical inference, especially under the parametric model assumption; see Wilks (1938), Chernoff (1954), Self and Liang (1987), and Drton (2009), and the references therein. In this section, we present LRT statistics and study their properties for testing (2) under model assumptions on and whether or not .
We first derive the log-likelihood function with unordered paired observations. For any , we have
[TABLE]
Therefore, the joint density function of is given by
[TABLE]
where $\phi(x_{1},x_{2};\boldsymbol{\theta})$ denotes the bivariate normal density function with parameters $\boldsymbol{\theta}=(\mu_{1},\mu_{2},\sigma_{1},\sigma_{2},\rho)^{\tau}$ specified in (1). The log-likelihood function based on and Model (1) is:
[TABLE]
This likelihood function is the basis for our subsequent development.
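As a rough illustration of this likelihood, the following Python sketch evaluates the log-likelihood of unordered pairs under Model (1). It relies on the fact that an unordered pair $(a,b)$ could have arisen from either labelling of the subunits, so it contributes $\phi(a,b;\boldsymbol{\theta})+\phi(b,a;\boldsymbol{\theta})$ to the likelihood. The function names `bvn_pdf` and `unordered_loglik` are hypothetical and are not taken from the authors' R package.

```python
import math

def bvn_pdf(x1, x2, mu1, mu2, s1, s2, rho):
    """Bivariate normal density phi(x1, x2; theta)."""
    z1 = (x1 - mu1) / s1
    z2 = (x2 - mu2) / s2
    one_minus_r2 = 1.0 - rho * rho
    quad = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / one_minus_r2
    norm = 2.0 * math.pi * s1 * s2 * math.sqrt(one_minus_r2)
    return math.exp(-0.5 * quad) / norm

def unordered_loglik(pairs, theta):
    """Log-likelihood of unordered pairs.

    theta = (mu1, mu2, sigma1, sigma2, rho). Each pair (a, b) contributes
    log{phi(a, b) + phi(b, a)} because the subunit labels are unknown.
    """
    total = 0.0
    for a, b in pairs:
        total += math.log(bvn_pdf(a, b, *theta) + bvn_pdf(b, a, *theta))
    return total
```

When the two means and the two standard deviations coincide, the two terms are equal and each pair contributes $\log\{2\phi(a,b;\boldsymbol{\theta})\}$; the value is also unchanged if the two entries of any pair are swapped.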
2.1 Unordered uncorrelated paired data
In this section, we assume that is known; problem (2) is reduced to . We define
[TABLE]
and we use the notational convention that the entries of $\hat{\boldsymbol{\theta}}$ are , , and so on. Note that $\hat{\boldsymbol{\theta}}$, $\tilde{\boldsymbol{\theta}}$, and $\check{\boldsymbol{\theta}}$ are MLEs of under various constraints. The LRT statistics for testing the null hypothesis (2) against two alternatives, specified by and respectively, are given by
[TABLE]
Theorem 1 below establishes the asymptotic distributions of and as well as the convergence rates of $\tilde{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\theta}}$ under . For presentational continuity, we relegate its proof to Section 6. Let denote “convergence in distribution.” We use for an equal mixture of and , with being the distribution with a point mass at zero.
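To make the mixture notation concrete, here is a minimal Python sketch of a p-value computed under an equal mixture of a point mass at zero and a chi-square distribution. The degrees of freedom of the second component are not recoverable from the text here, so the sketch assumes one degree of freedom purely for illustration; the function names are hypothetical.

```python
import math

def chi2_1_sf(x):
    """Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0)) if x > 0 else 1.0

def mixture_pvalue(stat):
    """p-value under 0.5 * (point mass at 0) + 0.5 * chi-square(1).

    The choice of 1 degree of freedom is an assumption made for this sketch.
    """
    if stat <= 0:
        return 1.0
    # Only the chi-square component can exceed a positive threshold.
    return 0.5 * chi2_1_sf(stat)
```

Under this assumed mixture, the 5% critical value is the 90% quantile of the chi-square distribution with one degree of freedom, which is why mixture tests reject at smaller values of the statistic than a plain chi-square test would.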
Theorem 1.
Assume Model (1) and . Under , as , we have
- (a) , and are all of order , and
[TABLE]
- (b) , for are all of order , and
[TABLE]
where and with being three i.i.d. random variables.
Deriving the asymptotic null distributions of and is technically challenging. We make the following comments. Let and so that and ; we have
[TABLE]
This fact implies that the Fisher information matrix of under the null hypothesis degenerates and undermines the basis for the elegant classical results (Wilks, 1938; Chernoff, 1954; Self and Liang, 1987; Drton, 2009). The crucial step in obtaining the asymptotic null distribution of the LRT is a quadratic approximation in $\hat{\boldsymbol{\theta}}-\boldsymbol{\theta}$ to the log-likelihood ratio function. Following this path, we need to consider a fourth-order Taylor expansion to obtain a quadratic approximation in $(\hat{\boldsymbol{\theta}}-\boldsymbol{\theta})^{2}$ and so on. Fortunately, we find that the sandwich technique of Chen and Chen (2001) and Chen et al. (2001) overcomes the technical obstacles caused by (5).
2.2 Unordered correlated paired data
In this section, we study the LRTs for (2) with being an unknown parameter. Define
[TABLE]
Similarly to the strategy for (4), we define the LRT statistics for (2) with being an unknown parameter:
[TABLE]
Theorem 2 below establishes the asymptotic distributions of and as well as the convergence rates of $\tilde{\boldsymbol{\theta}}^{*}$ and $\hat{\boldsymbol{\theta}}^{*}$ under their respective . The proof is given in Section 6.
Theorem 2.
Assume Model (1) but do not assume . Under , as , we have
- (a) , , , and are all of order , and
[TABLE]
- (b) , , , , and are all of order , and
[TABLE]
where , , and are three i.i.d. random variables.
The limiting cumulative distribution function (c.d.f.) of is given by:
[TABLE]
for with being the c.d.f. of the standard normal distribution. We use this expression to evaluate the asymptotic quantile and the p-value for the corresponding test.
3 Adjusted Limiting Distributions
One drawback of the general asymptotic results is that they may offer poor approximations to the corresponding finite-sample distributions. The convergence rates of the parameter estimators given in Theorems 1 and 2 are much lower than those of the MLEs from the regular parametric models. This adversely affects the approximation accuracy of the asymptotic distributions to the finite-sample distributions of the LRT statistics. To improve the approximation precision when is not very large, we use the Bartlett correction. Suppose the limiting distribution of a statistic is given by . We may search for a sequence of c.d.f.s such that and have the same first moment up to order . This idea was pioneered by Bartlett (1937) and generalized by Lawley (1956).
In this spirit, we search for accurate approximate distributions for , , , and as follows. Recall that and are the limiting distributions of and . Let
[TABLE]
We need to find , , , and so that the above distributions have first moments very close to the first moments of their corresponding test statistics for a wide range of values. High-order asymptotic techniques can be used, but they may involve complicated analytical tools with little assurance of the quality of the end products. The computer experiment approach of Chen and Li (2011) is more effective and practical, and it matches the spirit of data science.
The experiment works as follows. We consider a sufficiently wide range of values for . For each , we simulate a large number of data sets, with each data set composed of i.i.d. unordered paired observations. Due to the invariance property of the LRT statistics, each data set is generated from the standard bivariate normal distribution. Based on these data sets, we obtain the simulated first moments of , , , and . We choose so that the simulated first moment of matches the first moment of . We then look for a regression model for versus . Similar procedures are applied to obtain regression models for , , and .
Specifically, let us take for ease of illustration:
- Step 1. For every in , generate data sets of size .
- Step 2. Obtain values of and therefore its simulated first moment, denoted . Match with the first moment of to find .
- Step 3. Fit a regression model to with being the response and being the covariate.
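Steps 1 and 2 can be sketched in Python as follows. Computing the paper's LRT statistics requires a numerical maximization that is beyond a short example, so this sketch swaps in a toy statistic: the LRT for a zero mean in a normal sample with unknown variance, whose limiting distribution is chi-square with one degree of freedom. The classical Bartlett-type factor used here (simulated mean divided by limiting mean) is likewise a stand-in for the paper's adjusted distributions, and all function names are hypothetical.

```python
import math
import random

def toy_lrt(sample):
    """LRT statistic for H0: mean = 0 in a normal sample with unknown variance."""
    n = len(sample)
    xbar = sum(sample) / n
    ss0 = sum(x * x for x in sample)            # sum of squares under H0
    ss1 = sum((x - xbar) ** 2 for x in sample)  # sum of squares, unrestricted
    return n * math.log(ss0 / ss1)

def simulated_first_moment(n, reps, rng):
    """Steps 1-2: simulate `reps` null data sets of size n and average the statistic."""
    total = 0.0
    for _ in range(reps):
        total += toy_lrt([rng.gauss(0.0, 1.0) for _ in range(n)])
    return total / reps

def bartlett_factors(sizes, reps=20000, seed=1):
    """Match the simulated mean to the limiting mean for each sample size."""
    rng = random.Random(seed)
    limiting_mean = 1.0  # first moment of chi-square with 1 df
    return {n: simulated_first_moment(n, reps, rng) / limiting_mean
            for n in sizes}
```

Calling `bartlett_factors([10, 20, 50])` returns one factor per sample size; the factors should shrink toward 1 as the sample size grows. Step 3 would then regress these factors on the sample size to obtain a smooth correction for sizes that were not simulated.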
We postulate the following nonlinear but parametric regression models:
[TABLE]
with and being regression parameters, and accounting for imperfect fit. Applying Steps 1–2 outlined above leads to the , , , and values in Table 1. Fitting the nonlinear regression models (6)–(9) to the data in Table 1 gives us the fitted values of and . With these values, we calculate the approximate p-values with the following adjusted limiting distributions:
[TABLE]
We have implemented the four LRT statistics with the proposed adjusted limiting distributions in an R package; it is available upon request.
4 Simulation Studies
4.1 Data generation
Because of the invariance property, we need only study the LRTs based on data generated from distributions with standardized parameter values.
To examine the sizes of the tests, we simulate at and in (1). We study five cases corresponding to , and . To compare the powers of the tests, we set , , and form 20 cases as combinations of , and .
In each case, we generate from model (1) with one of the above parameter settings. Then, we obtain and . We repeat the process to obtain unordered pairs .
Based on each set of unordered pairs, we compute the values of , , , and and carry out the tests for without checking that the model for generating the data satisfies the conditions for the tests. We record the rejection rates based on repetitions; the results are presented in the next section.
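The data-generation step can be sketched in Python as below. The sketch assumes that the two observed values of each unit are the smaller and larger of the two subunit responses, which is how it discards the subunit labels; the function name and argument order are hypothetical.

```python
import math
import random

def simulate_unordered_pairs(n, mu1, mu2, s1, s2, rho, rng):
    """Draw n bivariate normal vectors and keep only (min, max) of each."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Build a correlated pair from two independent standard normals.
        x1 = mu1 + s1 * z1
        x2 = mu2 + s2 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        # Discard the subunit identities (assumed: record min and max).
        pairs.append((min(x1, x2), max(x1, x2)))
    return pairs
```

For example, `simulate_unordered_pairs(50, 0.0, 0.0, 1.0, 1.0, 0.5, random.Random(0))` produces one null data set with correlated subunits. Repeating the call and applying a test to each data set gives the simulated rejection rate.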
4.2 Results
We calculate the rejection rate of each test at the significance levels , and . The rejection percentages under the null models are summarized in Table 2.
When , and are simulated to be independent. The assumptions for all the LRTs, , , , and , are satisfied. However, as shown in the first section of Table 2, if their limiting distributions are applied without adjustment, the resulting tests are inaccurate: their type I errors markedly exceed the nominal significance levels. The adjustment proposed in Section 3 is very helpful. After the adjustment, the type I errors of all the tests are close to the nominal levels. The precision is impressive since the adjustment works well even when is as small as .
When or , the model assumptions for and are violated. When we apply the tests, the type I errors are either near zero when or or seriously inflated when or . In contrast, because of their invariance property, and continue to perform well: with their limiting distributions adjusted, they have satisfactory precision in the type I errors.
To further illustrate the effects of the adjustment on the limiting distributions, Figure 1 presents the type I errors (%) of our LRTs at the 5% significance level when and . The trends for the 10% and 1% significance levels are similar and are omitted. The plots show that the type I errors of , after the adjustment are within a band of the nominal level for large and a band otherwise; similar results are observed for . For , the approximation accuracy shows no clear improvement as increases, but the type I errors are between 5% and 5.4%, which is sufficiently accurate for typical applications.
Next, we compare the powers of , , , and under the alternatives. All combinations of , , , and are incorporated, as described in Section 4.1. Their powers, summarized in Table 3, are computed at the 5% significance level based on the adjusted limiting distributions. We observe that when , and have higher powers than and ; when , and have higher powers in most cases; when is increased to 0.5, and are much more powerful; when and , and are more powerful, but at the cost of the inflated type I errors reported in Table 2; a test with a markedly inflated type I error is generally not recommended.
5 Real-Data Examples
5.1 Data from karyotype analysis
This example considers 40 unordered pairs of the lengths of the longer and shorter arms of chromosome II of Larix decidua from 40 specimens; so . The data are available in Table 1 of Matérn and Simak (1968). The test results from , , , and for (2) are as follows:
- •
and . Calibrated by the adjusted limiting distributions, the asymptotic -values of and are and .
- •
and . Calibrated by the adjusted limiting distributions, the asymptotic -values of and are and .
The maximum likelihood estimate of is found to be
[TABLE]
Note that suggests strong negative correlation between and . As revealed in the simulation studies reported in the bottom section of Table 2, and are therefore not reliable because they are designed for . Moreover, the fitted values and are very close, but and are significantly different. Hence, is unsuitable because it is designed for the case where . We recommend , which is designed to detect departures from either equal-mean or equal-variance hypotheses.
5.2 C-band area of human chromosome data
This example consists of normalized measurements of the C-band area on the No. 9 chromosome pair (Mason et al., 1975). The measurements are based on three groups: the father, mother, and offspring. These groups respectively have 40, 18, and 31 unordered pairs of normalized measurements of the C-band area. The data are available in Table 1 of Lauder (1977). We analyze the group of fathers as an example; the analysis of the other groups is similar. We constructed , , , and and the corresponding -values from the adjusted limiting distributions. The results are as follows:
- •
and with . Calibrated by the adjusted limiting distributions, the asymptotic -values of and are and .
- •
and with . Calibrated by the adjusted limiting distributions, the asymptotic -values of and are 7.5 and .
The maximum likelihood estimate of is found to be
[TABLE]
Note that suggests strong positive correlation between and . Moreover, and are quite different whereas . These observations suggest that is the most suitable test while is also a possibility. Note that is sharper than with a smaller p-value.
6 Technical Details
6.1 Reparameterization and preparation lemmas
Recall that is the unordered pair of and the latter has a bivariate normal distribution with parameter vector $\boldsymbol{\theta}=(\mu_{1},\mu_{2},\sigma_{1},\sigma_{2},\rho)^{\tau}$. The log-likelihood function based on is
[TABLE]
Let and . We introduce notation for the following quantities:
[TABLE]
Further, let and
[TABLE]
Note that we use to denote the density function of , matching $\phi(x_{1},x_{2};\boldsymbol{\theta})$ for the bivariate normal distribution.
With these, we obtain the following decomposition of the likelihood function:
[TABLE]
We use a generic for the parameters, which may be interpreted as $\boldsymbol{\theta}=(\mu,\sigma_{+},\beta_{0},\beta_{1},\eta)^{\tau}$ when necessary.
Under in Theorem 1 which includes the assumption that , suppose the true parameter values of the data-generating distribution are , . We may then, in our proofs, work with the transformed data
[TABLE]
After the transformation, the algebraic form of the likelihood does not change but the true parameter values of the data-generating distribution become and . Without loss of generality, based on the above invariance property, we may assume that the true parameters and under .
Under in Theorem 2, without loss of generality, the same assumption is applicable to and . We now reveal that by the same invariance principle we may also assume as long as the true value . When , we simply let
[TABLE]
The data-generating distribution now has the true parameter values , , and under .
With the above standardization operation, for both Theorems 1 and 2, we study the asymptotic null properties under the assumption that and are independent normal random variables with the standard parameter values:
[TABLE]
We first establish three preparatory lemmas.
Lemma 1.
As , we have, almost surely,
[TABLE]
where $\mathbbm{1}(\cdot)$ is the indicator function.
Proof.
Note that
[TABLE]
is the empirical measure of the two-dimensional stripe formed by the inequality
[TABLE]
This class of stripes can divide points in two-dimensional space into at most a polynomial number of different subsets. By Pollard (1990), this property implies the uniform strong law of large numbers:
[TABLE]
almost surely.
The distribution of is normal with variance at least 1. Based on this, we have for any . Hence, almost surely,
[TABLE]
This completes the proof. ∎
Lemma 2.
Suppose an estimator $\bar{\boldsymbol{\theta}}$ satisfies
[TABLE]
for some constant . Then under the null model, $\bar{\boldsymbol{\theta}}=\boldsymbol{\theta}_{0}+o_{p}(1)=(0,1,0,0,1)^{\tau}+o_{p}(1)$.
Proof.
Note that we have decomposed $\ell_{n}(\bar{\boldsymbol{\theta}})-\ell_{n}(\boldsymbol{\theta}_{0})$ into a sum of two terms. For the first term, according to the classical result about the LRT under regular models, it is clear that
[TABLE]
When in the second term the variance parameter , we have
[TABLE]
By the law of large numbers, we have
[TABLE]
almost surely. This implies that
[TABLE]
and subsequently, uniformly for in this range,
[TABLE]
Together with (12), we have, whenever ,
[TABLE]
in probability. Since the lemma condition clearly states that does not have the above property, it cannot be in this range. That is, we conclude that .
Suppose and is a very small positive value. In this case, for all , we have
[TABLE]
For such that
[TABLE]
we have
[TABLE]
By Lemma 1, uniformly in and and almost surely, at least of the ’s satisfy (13). Therefore,
[TABLE]
as and . Namely, for all sufficiently small, we also have
[TABLE]
In conclusion, the value satisfying the lemma condition must almost surely fall within the interval for some sufficiently small and sufficiently large .
Within the parameter space , the density function
[TABLE]
satisfies the conditions for the consistency of the MLE specified in Wald (1949). For instance, it is a continuous density function with its limit being 0 whenever or goes to infinity. For a sufficiently small , let
[TABLE]
be a ball centered at the true value. The side conclusion as stated in Wald (1949) is
[TABLE]
for some . Again, by the lemma condition on $\bar{\boldsymbol{\theta}}$, we must have within of the true parameter value for any as . This proves part of the lemma.
It is now apparent that we also have
[TABLE]
By the same argument based on the assumed property of $\bar{\boldsymbol{\theta}}$, we must have
[TABLE]
This is sufficient for the proof of the consistency of . Combined with the proof of the other parts, this completes the proof of the lemma. ∎
Next, we strengthen the results of Lemma 2. We first define some notation for the next lemma. Let
[TABLE]
It can be seen that , , and are uncorrelated, and
[TABLE]
Further, we introduce two parameter vectors of lengths 2 and 4:
[TABLE]
In the following, we use and to denote the and norms of the vector , respectively.
Lemma 3.
Under the conditions of Lemma 2 and the null hypothesis, we have
[TABLE]
Proof.
We first prove (a). By Lemma 2, we have . We obtain (a) by expanding at to the second order and then assessing the asymptotic orders via the weak law of large numbers.
To prove (b), we first denote
[TABLE]
and then write
[TABLE]
Applying the inequality , we have
[TABLE]
Next, we delineate given as proved in Lemma 2. We perform two main steps. In the first step, we obtain the fourth-order Taylor expansion of ; in the second step, we assess the asymptotic orders of the terms in the expansion and put them into appropriate order expressions.
We start with the first step. Let the partial derivatives be
[TABLE]
Expanding both to the fourth order at , we get
[TABLE]
where the summation is over all non-negative integer combinations of summing to and is the remainder term in the Taylor expansion. Let , then
[TABLE]
In the second step, we first show that every term in the summation part of (16) satisfying is of order . For instance, when , we have
[TABLE]
helped by the fact that we are investigating the region of . For notational simplicity, let . It is easy to check that has zero mean and finite variance, so
[TABLE]
Therefore, we have
[TABLE]
The proofs for the other terms are similar. Hence, we may write
[TABLE]
and still have
[TABLE]
By straightforward algebra, we find
[TABLE]
where the unwanted term is the fourth element of vector . Its coefficient is easily verified to be . This allows us to obtain a neater expression by absorbing it into the higher-order term, concluding that
[TABLE]
such that
[TABLE]
In short, we have shown that
[TABLE]
The above algebraic manipulations are typical of the techniques employed in Chen and Chen (2001) and Chen et al. (2001). The same techniques, which are tedious but not sophisticated, give
[TABLE]
Together with the weak law of large numbers these lead to
[TABLE]
Combining (22)–(24) with (15), we have
[TABLE]
Recall that , so the above upper bound is applicable to . This completes the proof of (b).
Finally, we come to (c). Combining (a) and (b) and the conditions in Lemma 2, we have
[TABLE]
which is possible only if both and . This leads to the order assessments in (c) and completes the proof of the entire lemma. ∎
6.2 Proof of Theorem 1
The difference between Theorems 1 and 2 is that in the former we consider to be known when formulating the test statistic. This makes it helpful to reorganize the entries of and and the corresponding entries of and .
When is known, we have . Let
[TABLE]
Every entry of and is a linear combination of the entries of $\mathbf{t}$, possibly with an difference when these parameter values approach their default null values. We enumerate these entries as follows. The first entry of is , and the second is . For the entries of , we have
[TABLE]
For the others, , , and .
Because every entry of and is virtually a linear combination of the entries of $\mathbf{t}$, we can reorganize the entries of and into a vector such that
[TABLE]
Naturally, we have and some algebra shows that . The following result is immediate.
Lemma 4.
Assume the conditions of Lemma 3 and let . If, under the null model,
[TABLE]
we then have
[TABLE]
We are now ready for Theorem 1. The order conclusions of the MLEs in both Theorem 1(a) and 1(b) have been established in Lemma 4. We now derive the limiting distributions.
We rewrite defined in (4) as
[TABLE]
with $\check{\boldsymbol{\theta}}$ being the maximum point of the reduced model where . Since the reduced model is regular, by standard techniques such as those in Serfling (2000):
[TABLE]
where denote the first two entries of vector .
Next, note that $\tilde{\boldsymbol{\theta}}$ is the maximum point of the reduced model where . This makes and subsequently for $\mathbf{t}$ under the reduced model,
[TABLE]
Nevertheless, Lemma 4 is applicable to the above form of $\mathbf{t}$ as long as it is close to its counterpart in the null model. Hence,
[TABLE]
Note the range of the supremum conforms to the form of $\mathbf{t}$ in the reduced model and the fact that . The specific coefficient values are due to the value of .
The upper bound in (27) is attained if we put
[TABLE]
With some straightforward algebra, the corresponding values of $\mathbf{t}$ exist and satisfy
[TABLE]
Applying the Taylor expansion, with being the above , we get
[TABLE]
Since $\tilde{\boldsymbol{\theta}}$ is the maximum point of $\ell_{n}(\boldsymbol{\theta})$, $2\{\ell_{n}(\tilde{\boldsymbol{\theta}})-\ell_{n}(\boldsymbol{\theta}_{0})\}$ is not smaller than the value in (27). The sandwich technique of Chen and Chen (2001) and Chen et al. (2001) or the squeeze theorem can be applied to obtain
[TABLE]
[TABLE]
which has the limiting distribution . This completes the proof of part (a).
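The sandwich step used above can be summarized schematically. The symbols $R_n$, $L_n$, $U_n$ and $W$ are placeholders introduced here for illustration and are not the paper's notation: $R_n$ stands for the likelihood ratio statistic, $U_n$ for the upper bound, and $L_n$ for the value of the expansion at the parameter point where the bound is attained.

```latex
L_n \le R_n \le U_n, \qquad U_n - L_n = o_p(1), \qquad U_n \xrightarrow{d} W
\quad\Longrightarrow\quad R_n \xrightarrow{d} W .
```

The implication follows from Slutsky's theorem, since $0 \le U_n - R_n \le U_n - L_n = o_p(1)$.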
We now prove conclusion (b). In this case, the range of t has only an intrinsic restriction as seen in the expression
[TABLE]
Let and . It can be seen that lies on a two-dimensional manifold. Nonetheless, the upper bound developed in Lemma 4 remains valid. We partition into and with covariance matrices and . With these preparations, we have
[TABLE]
The supremum is taken over with the intrinsic restriction respected. Similarly to (30), the upper bound (31) is attained at some feasible parameter value. Hence,
[TABLE]
Combining (26) and (32), we get
[TABLE]
The intrinsic restriction due to the specific form of leads to the nonstandard form of the limiting distribution in the theorem.
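To illustrate how a restriction on the range of a supremum produces a nonstandard limit, the following minimal sketch simulates the simplest textbook case of the kind treated by Chernoff (1954). It is not the limiting distribution of the theorem: the one-dimensional statistic $\sup_{t \ge 0}\{2tw - t^2\}$ with $w \sim N(0,1)$ is an assumption chosen purely for illustration. The supremum equals $\{\max(w,0)\}^2$, which is an equal mixture of a point mass at zero and a $\chi^2_1$ variable.

```python
import numpy as np

rng = np.random.default_rng(2024)
w = rng.standard_normal(200_000)  # hypothetical standard normal "score"

# For each w, the supremum over t >= 0 of 2*t*w - t**2 is attained at
# t = max(w, 0), giving the value max(w, 0)**2.
stat = np.maximum(w, 0.0) ** 2

# Half of the mass sits at zero because w < 0 with probability 1/2.
frac_zero = float(np.mean(stat == 0.0))

# Tail probabilities at the 90% and 95% quantiles of chi-square(1).
# Under the half-and-half mixture they are halved relative to chi-square(1).
tail_at_q90 = float(np.mean(stat > 2.705543))
tail_at_q95 = float(np.mean(stat > 3.841459))

print(frac_zero, tail_at_q90, tail_at_q95)
```

Under this toy restriction the usual $\chi^2_1$ critical values would be conservative, which is why the form of the restriction has to be carried into the limiting distribution.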
6.3 Proof of Theorem 2
The test problem in Theorem 2 is different from that of Theorem 1 because we do not assume knowledge of the value. The parameter vector is now $\boldsymbol\theta=(\mu_{1},\mu_{2},\sigma_{1},\sigma_{2},\rho)^{\tau}$, including the correlation coefficient $\rho$. Because of the invariance argument, we need consider only the case where $\boldsymbol\theta_{0}=(0,0,1,1,0)^{\tau}$ under the null hypothesis for the asymptotic properties in this theorem.
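As a concrete reference point for this parameterization, here is a minimal sketch of a log-likelihood for unordered pairs. It assumes a bivariate normal model and takes the density of an unordered pair $\{a,b\}$ to be $f(a,b;\boldsymbol\theta)+f(b,a;\boldsymbol\theta)$; the function names `bvn_logpdf` and `unordered_loglik` are hypothetical and not taken from the paper.

```python
import numpy as np

def bvn_logpdf(x1, x2, theta):
    """Log density of a bivariate normal at (x1, x2).

    theta = (mu1, mu2, sigma1, sigma2, rho), matching the parameter vector above.
    """
    mu1, mu2, s1, s2, rho = theta
    z1 = (x1 - mu1) / s1
    z2 = (x2 - mu2) / s2
    q = (z1**2 - 2.0 * rho * z1 * z2 + z2**2) / (1.0 - rho**2)
    return -np.log(2.0 * np.pi * s1 * s2) - 0.5 * np.log(1.0 - rho**2) - 0.5 * q

def unordered_loglik(y, theta):
    """Log-likelihood when the subunit labels within each row of y are lost.

    Assumption: the density of an unordered pair {a, b} is f(a, b) + f(b, a).
    """
    a, b = y[:, 0], y[:, 1]
    # logaddexp evaluates log(f(a, b) + f(b, a)) stably on the log scale.
    return float(np.sum(np.logaddexp(bvn_logpdf(a, b, theta),
                                     bvn_logpdf(b, a, theta))))

rng = np.random.default_rng(0)
theta0 = (0.0, 0.0, 1.0, 1.0, 0.0)   # null value used in the proof
x = rng.standard_normal((200, 2))    # labelled pairs generated under theta0
y = np.sort(x, axis=1)               # only (min, max) is retained

ll_sorted = unordered_loglik(y, theta0)
ll_raw = unordered_loglik(x, theta0)

theta = (0.3, -0.2, 1.2, 0.8, 0.1)
theta_swapped = (-0.2, 0.3, 0.8, 1.2, 0.1)  # subunit labels exchanged
ll_theta = unordered_loglik(y, theta)
ll_theta_swapped = unordered_loglik(y, theta_swapped)
```

In this sketch the likelihood does not depend on the order within a pair, and exchanging $(\mu_1,\sigma_1)$ with $(\mu_2,\sigma_2)$ leaves it unchanged, which reflects the loss of subunit identities.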
With the introduction of , it helps to redefine , , and so on as follows:
[TABLE]
and the corresponding , as
[TABLE]
These are almost the quantities with the same names defined above Lemma 3. The difference is that the first entry of is now the third entry of . That is, we partition the vector differently here.
When in Theorem 2, the asymptotic expansion of the likelihood ratio is an expansion for regular models:
[TABLE]
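For a regular model, an expansion of this kind typically reduces the likelihood ratio statistic to the supremum of a quadratic form. The sketch below checks numerically that $\sup_{t}\{2t^{\tau}w - t^{\tau}\Sigma t\} = w^{\tau}\Sigma^{-1}w$ for a positive definite $\Sigma$. The names `w`, `Sigma` and `quad` are placeholders chosen for illustration and are not the paper's notation, and the display above may differ in form.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical positive definite matrix standing in for an information matrix.
A = rng.standard_normal((3, 3))
Sigma = A @ A.T + 3.0 * np.eye(3)

# Hypothetical vector standing in for a normalized score.
w = rng.standard_normal(3)

def quad(t):
    """Quadratic form 2 t'w - t' Sigma t."""
    return 2.0 * t @ w - t @ Sigma @ t

# The unconstrained maximizer solves Sigma t = w.
t_hat = np.linalg.solve(Sigma, w)
closed_form = float(w @ t_hat)  # equals w' Sigma^{-1} w

# Random perturbations around t_hat should never exceed the closed form.
trials = t_hat + 0.5 * rng.standard_normal((1000, 3))
max_trial = max(float(quad(t)) for t in trials)

print(closed_form, float(quad(t_hat)), max_trial)
```

When $w$ is asymptotically normal with covariance $\Sigma$, the closed form is the familiar chi-square limit; restrictions on the range of $t$ are what change this picture.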
The result of Lemma 3 remains applicable:
[TABLE]
Since in Theorem 2(a), we have
[TABLE]
This leads to
[TABLE]
where we have instead of because of the intrinsic constraint . We skip the step of showing that the above upper bound is attainable, since this is now routine.
[TABLE]
which converges to in distribution, which is conclusion (a).
For in (b), we are not helped by . Yet
[TABLE]
remains true for in a small neighborhood of $\boldsymbol\theta_{0}$. Similarly, we still have
[TABLE]
We skip the proof that this upper bound is attained. Hence,
[TABLE]
The challenge is to provide an analytical description of the limiting distribution when
[TABLE]
For this purpose, we highlight the fact that is asymptotically multivariate normal with mean 0 and covariance matrix . The supremum is hence attained in the range of . In the subregion where , we have . Hence,
[TABLE]
In the other subregion where , combined with the restriction , we must have . Consequently, in this region, . This leads to
[TABLE]
Hence,
[TABLE]
[TABLE]
Therefore, has the limiting distribution as claimed.
Acknowledgements
The research is supported in part by NSERC Grants RGPIN-2014-03743 and RGPIN-2015-06592, the Singapore Ministry of Education Academic Research Fund Tier 1, and the Ministry of Education of Singapore grant MOE2014-T2-1-072.
References
- Bartlett, M. S. (1937), ‘Properties of sufficiency and statistical tests’, Proceedings of The Royal Society A 160, 268–282.
- Carothers, A. D. (1981), ‘On determining the parental origins of homologous chromosomes’, Annals of Human Genetics 45, 367–374.
- Chen, H. and Chen, J. (2001), ‘The likelihood ratio test for homogeneity in finite mixture models’, The Canadian Journal of Statistics 29, 201–215.
- Chen, H., Chen, J. and Kalbfleisch, J. D. (2001), ‘A modified likelihood ratio test for homogeneity in finite mixture models’, Journal of the Royal Statistical Society: Series B 63, 19–29.
- Chen, J. and Li, P. (2011), ‘Tuning the EM-test for finite mixture models’, The Canadian Journal of Statistics 39(3), 389–404.
- Chernoff, H. (1954), ‘On the distribution of the likelihood ratio’, The Annals of Mathematical Statistics 25, 573–578.
- Davies, P. and Phillips, A. J. (1988), ‘Nonparametric tests of population differences and estimation of the probability of misidentification with unidentified paired data’, Biometrika 75, 753–760.
