Causal Inference from Possibly Unbalanced Split-Plot Designs: A Randomization-based Perspective
Rahul Mukerjee, Tirthankar Dasgupta

TL;DR
This paper develops randomization-based methods for causal inference in possibly unbalanced split-plot designs, providing a conservative variance estimator that becomes unbiased under between-whole-plot additivity and a construction procedure that targets minimax bias.
Contribution
It extends randomization-based causal inference methods to unbalanced split-plot designs, introducing a new variance estimator that is unbiased under milder conditions and a minimax bias construction procedure.
Findings
Derived a sampling variance expression for the treatment contrast estimator.
Proposed a new variance estimator that becomes unbiased under milder conditions.
Introduced a minimax bias construction procedure.
Abstract
Split-plot designs find wide applicability in multifactor experiments with randomization restrictions. Practical considerations often warrant the use of unbalanced designs. This paper investigates randomization based causal inference in split-plot designs that are possibly unbalanced. Extension of ideas from the recently studied balanced case yields an expression for the sampling variance of a treatment contrast estimator as well as a conservative estimator of the sampling variance. However, the bias of this variance estimator does not vanish even when the treatment effects are strictly additive. A careful and involved matrix analysis is employed to overcome this difficulty, resulting in a new variance estimator, which becomes unbiased under milder conditions. A construction procedure that generates such an estimator with minimax bias is proposed.
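Table 1: Simulation settings for the eight populations (mean vector, variance parameter and correlation parameter of the potential outcomes in each of the four whole-plots).

| Population | Mean vector (whole-plot 1) | Mean vector (whole-plot 2) | Mean vector (whole-plot 3) | Mean vector (whole-plot 4) | Variance parameter (whole-plot 1) | Variance parameter (whole-plot 2) | Variance parameter (whole-plot 3) | Variance parameter (whole-plot 4) | Correlation (whole-plot 1) | Correlation (whole-plot 2) | Correlation (whole-plot 3) | Correlation (whole-plot 4) |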
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| I | (10,5,9,8) | (10,5,9,8) | (10,5,9,8) | (10,5,9,8) | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 |
| II | (10,5,9,8) | (9,7,4,6) | (11,8,7,8) | (8,7,6,9) | 2.5 | 2 | 2 | 3 | .5 | .5 | .5 | .5 |
| III | (10,5,9,8) | (5,9,10,8) | (10,9,8,5) | (10,5,8,9) | 2.5 | 2 | 2 | 3 | 1 | 1 | 1 | 1 |
| IV | (10,5,9,8) | (5,9,10,8) | (10,9,8,5) | (10,5,8,9) | 2.5 | 2 | 2 | 3 | .5 | .5 | .5 | .5 |
| V | (10,5,9,8) | (5,9,10,8) | (10,9,8,5) | (10,5,8,9) | 2.5 | 2 | 2 | 3 | .2 | .4 | .6 | .8 |
| VI | (10,5,9,8) | (5,9,10,8) | (10,9,8,5) | (10,5,8,9) | 2.5 | 2 | 2 | 3 | 0 | 0 | 0 | 0 |
| VII | (10,5,9,8) | (5,9,10,8) | (10,9,8,5) | (10,5,8,9) | 2.5 | 2 | 2 | 3 | -.3 | -.3 | -.3 | -.3 |
| VIII | (10,5,9,8) | (5,9,10,8) | (10,9,8,5) | (10,5,8,9) | 2.5 | 2 | 2 | 3 | -.3 | .3 | -.3 | .3 |
Rahul Mukerjee
Indian Institute of Management Calcutta, Joka, Diamond Harbour Road, Kolkata 700104, India
Tirthankar Dasgupta
Keywords: Bias; Factorial experiment; Finite population; Minimaxity; Treatment-effect additivity.
Introduction
Factorial experiments were originally developed in the context of agricultural experiments (Fisher, 1925, 1935; Yates, 1935) and were later used extensively in industrial and engineering applications (Wu and Hamada, 2009). Such experiments are currently undergoing a third surge in popularity in the social, behavioral, and biomedical sciences. However, one of the key challenges of using standard principles of designing and analyzing factorial experiments in these fields arises from randomization restrictions. Consider a simplified version of the education experiment described in Dasgupta et al. (2015). Suppose the goal is to assess the causal effects of two interventions (referred to as factors in the experimental design literature), namely a mid-year quality review by a team of experts and a bonus scheme for teachers, on the performances of 40 schools in the state of New York. Each factor has two levels denoted by 1 (application) and 0 (non-application). A completely randomized assignment of the 40 schools to the four treatment combinations is likely to disperse the schools assigned to level 1 of the review factor (i.e., schools to undergo review) all over the state. Such a design may be prohibitive in terms of travel cost and time. A more practical alternative would be to divide these 40 schools by geographic proximity into four groups called whole-plots. Two of these whole-plots would then be randomly assigned to level 0 and the other two to level 1 of the review factor. The teacher bonus scheme can then be applied to half of the schools chosen randomly within each whole-plot. Such a randomization scheme is an example of a classic split-plot design. See Kirk (1982), Cochran and Cox (1957), Box et al. (2005), and Wu and Hamada (2009) for formal definitions.
Randomization-based inference is the most natural methodology for drawing inference on causal effects of treatments from split-plot experiments in a finite population setting, as observed by Freedman (2006, 2008). Recently, Zhao et al. (2018) developed a framework for randomization-based estimation of finite-population causal effects for balanced split-plot designs, in which each whole-plot consists of the same number of units or sub-plots, and any treatment combination of the sub-plot factors occurs equally often in all whole-plots; see (4) below. However, unbalanced split-plot designs are quite common in the social sciences. Consider the school experiment described earlier. Suppose the 40 schools are spread over four counties with 8, 8, 12 and 12 schools in these counties. In this case, each county can be considered a natural whole-plot. Thus the design is unbalanced and the estimation methodology proposed by Zhao et al. (2018) is no longer applicable.
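The two-stage randomization for this unbalanced school example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' code: the function name `split_plot_assignment`, the 0/1 level coding, and the rule that exactly half of each county's schools receive the bonus are taken from the verbal description above.

```python
import numpy as np

def split_plot_assignment(sizes, n_level1_plots, rng):
    """Two-stage randomization with one two-level whole-plot factor and
    one two-level sub-plot factor (levels coded 0/1). Hypothetical helper."""
    n_plots = len(sizes)
    # Stage 1: pick the whole-plots that receive level 1 of the whole-plot factor.
    plot_level = np.zeros(n_plots, dtype=int)
    plot_level[rng.choice(n_plots, size=n_level1_plots, replace=False)] = 1
    # Stage 2: within every whole-plot, half of the sub-plots receive level 1
    # of the sub-plot factor, chosen independently of stage 1.
    unit_levels = []
    for m in sizes:
        levels = np.zeros(m, dtype=int)
        levels[rng.choice(m, size=m // 2, replace=False)] = 1
        unit_levels.append(levels)
    return plot_level, unit_levels

rng = np.random.default_rng(0)
sizes = [8, 8, 12, 12]                      # schools per county
plot_level, unit_levels = split_plot_assignment(sizes, 2, rng)
```

Every school in a county shares the county's whole-plot level, so the review is applied to whole counties while the bonus varies within each county.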
In this paper we investigate randomization based causal inference in split-plot designs that are possibly unbalanced, using the potential outcomes framework (Neyman, 1923; Rubin, 1974, 1978, 2005). We start with a natural unbiased estimator of a typical treatment contrast and first examine how far the approach of Zhao et al. (2018) for the balanced case can be adapted to our more general setup. It is seen that this approach, aided by a variable transformation, yields an expression for the sampling variance of the treatment contrast estimator but runs into difficulty in variance estimation. Specifically, as in the balanced case and other situations in causal inference, the resulting variance estimator is conservative in the sense of having a nonnegative bias. However, unlike in most standard situations, the bias does not vanish even under strict additivity or homogeneity of treatment effects. To overcome this problem, a careful matrix analysis is employed, leading, under wide generality, to a new variance estimator. This estimator is also conservative, but enjoys the nice property of becoming unbiased under between-whole-plot additivity, a condition even milder than strict additivity. We also discuss the issue of minimaxity, with a view to controlling the bias in variance estimation, and explore the bias of the estimator under treatment effect heterogeneity via simulations.
Treatment contrast and its unbiased estimation
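Consider a factorial experiment conducted to assess the causal effects of a set of whole-plot factors and a set of sub-plot factors on a finite population of $N$ units. Each factor has two or more levels. The treatment combinations are denoted by $z = z_1 z_2$, where $z_1 \in Z_1$ and $z_2 \in Z_2$, with $Z_1$ and $Z_2$ the sets of level combinations of the whole-plot factors and the sub-plot factors, respectively; let $Z$ denote the set of all treatment combinations. For $i = 1, \ldots, N$, let $Y_i(z)$ denote the potential outcome of unit $i$ when exposed to treatment combination $z$. A typical treatment contrast for unit $i$ is of the form
$$\tau_i = \sum_{z \in Z} g(z)\, Y_i(z), \tag{1}$$
where $g(z)$, $z \in Z$, are known, not all zeros, and sum to zero. Let
$$\bar{Y}(z) = \frac{1}{N} \sum_{i=1}^{N} Y_i(z) \tag{2}$$
denote the average potential outcome for treatment combination $z$, and let
$$\bar{\tau} = \sum_{z \in Z} g(z)\, \bar{Y}(z) = \frac{1}{N} \sum_{i=1}^{N} \tau_i \tag{3}$$
denote a treatment contrast for the finite population of $N$ units. We define $\bar{\tau}$ as the finite-population causal estimand of interest and consider the problem of drawing inference on $\bar{\tau}$ using the outcomes observed from the experiment.
The observed outcomes are generated through an assignment mechanism, which is the process of allocating treatment combinations to the units. Here we consider a split-plot assignment mechanism which can be described as follows. Suppose there is a partitioning of the $N$ experimental units into $W$ disjoint sets $\Omega_1, \ldots, \Omega_W$, called whole-plots, such that $\Omega_w$ consists of $M_w$ units, called sub-plots, $w = 1, \ldots, W$, and $M_1 + \cdots + M_W = N$. Consider now a two-stage randomization, which assigns $r_1(z_1)$ whole-plots to level combination $z_1$ of the whole-plot factors and then, for each $w$, assigns $r_{w2}(z_2)$ sub-plots within whole-plot $\Omega_w$ to level combination $z_2$ of the sub-plot factors. Here at each stage all assignments are equiprobable, the $r_1(z_1)$ and $r_{w2}(z_2)$ are fixed positive integers, and $\sum_{z_1 \in Z_1} r_1(z_1) = W$, $\sum_{z_2 \in Z_2} r_{w2}(z_2) = M_w$ for $w = 1, \ldots, W$.
Note that the above assignment mechanism yields a balanced split-plot design if
$$M_1 = \cdots = M_W \quad \text{and} \quad r_{12}(z_2) = \cdots = r_{W2}(z_2) \ \text{for every } z_2 \in Z_2. \tag{4}$$
In the school example described in Section 1, the whole-plots represent sets of schools within a county and we have $N = 40$, $W = 4$, $Z_1 = Z_2 = \{0, 1\}$, $M_1 = M_2 = 8$, $M_3 = M_4 = 12$. Finally, $r_1(z_1) = 2$ for all $z_1$, $r_{w2}(z_2) = 4$ for $w = 1, 2$ and $r_{w2}(z_2) = 6$ for $w = 3, 4$. Thus, the design is unbalanced.
To define the observed outcomes of the experiment, we introduce two sets of random treatment assignment indices at the whole-plot and the sub-plot levels. Let $T_1(z_1)$ denote the set of indices $w$ such that whole-plot $\Omega_w$ is randomly assigned to level combination $z_1$ of the whole-plot factors. Similarly, for $w = 1, \ldots, W$ and $z_2 \in Z_2$, let $T_{w2}(z_2)$ be the set of sub-plots in $\Omega_w$ randomly assigned to level combination $z_2$ of the sub-plot factors. For any treatment combination $z = z_1 z_2$, the observed outcomes from the whole-plot $\Omega_w$, $w \in T_1(z_1)$, are then $Y_i(z)$, $i \in T_{w2}(z_2)$. Let
$$\bar{Y}_w^{\mathrm{obs}}(z) = \frac{1}{r_{w2}(z_2)} \sum_{i \in T_{w2}(z_2)} Y_i(z) \tag{5}$$
denote the average observed outcome for treatment combination $z$ within whole-plot $\Omega_w$ for $w \in T_1(z_1)$. In the spirit of the usual unbiased estimator of the population mean in two-stage sampling (Cochran, 1977), define
$$\hat{Y}(z) = \frac{1}{r_1(z_1)\, \bar{M}} \sum_{w \in T_1(z_1)} M_w\, \bar{Y}_w^{\mathrm{obs}}(z), \tag{6}$$
where $\bar{M} = N/W$ is the average whole-plot size. From (5) and (6), it is straightforward to verify by conditioning on the randomization at the whole-plot level that $E\{\hat{Y}(z)\} = \bar{Y}(z)$, where $\bar{Y}(z)$ is given by (2). Using (3), an immediate consequence of this fact is Proposition 1.
Proposition 1.
An unbiased estimator of the finite population treatment contrast $\bar{\tau}$ is given by
$$\hat{\tau} = \sum_{z \in Z} g(z)\, \hat{Y}(z), \tag{7}$$
where $\hat{Y}(z)$ is given by (6).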
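The unbiasedness claimed in Proposition 1 can be checked exactly on a toy population by enumerating every split-plot assignment. The sketch below is an illustration, not the authors' code. It assumes the two-stage-sampling form of the estimator in which each selected whole-plot mean is weighted by its size relative to the average whole-plot size, and it assumes a hypothetical 2x2 factorial with whole-plot sizes 2, 2, 4, 4; the names `tau_hat`, `plots` and `g` are introduced here for illustration.

```python
import itertools
import numpy as np

def tau_hat(Y, plots, level1_plots, level1_units, g):
    """Contrast estimate for one realized assignment (2x2 factorial).
    Y[i, z1, z2]: potential outcomes; plots[w]: unit indices of whole-plot w;
    level1_plots: whole-plots at level 1 of the whole-plot factor;
    level1_units[w]: units of whole-plot w at level 1 of the sub-plot factor."""
    n_plots = len(plots)
    avg_size = sum(len(p) for p in plots) / n_plots
    estimate = 0.0
    for z1 in (0, 1):
        chosen = [w for w in range(n_plots) if (w in level1_plots) == (z1 == 1)]
        for z2 in (0, 1):
            weighted_sum = 0.0
            for w in chosen:
                units = [i for i in plots[w] if (i in level1_units[w]) == (z2 == 1)]
                # Only outcomes actually observed under (z1, z2) are used.
                weighted_sum += len(plots[w]) * Y[units, z1, z2].mean()
            estimate += g[z1, z2] * weighted_sum / (len(chosen) * avg_size)
    return estimate

rng = np.random.default_rng(1)
plots = [[0, 1], [2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]   # unequal sizes
Y = rng.normal(size=(12, 2, 2))                           # arbitrary potential outcomes
g = np.array([[0.5, -0.5], [-0.5, 0.5]])                  # interaction contrast

estimates = []
for level1_plots in itertools.combinations(range(4), 2):
    per_plot = [itertools.combinations(p, len(p) // 2) for p in plots]
    for level1_units in itertools.product(*per_plot):
        estimates.append(tau_hat(Y, plots, level1_plots, level1_units, g))

tau_bar = float((g * Y.mean(axis=0)).sum())   # finite-population contrast
mean_estimate = float(np.mean(estimates))     # exact randomization expectation
```

Because all assignments are equiprobable, the plain average over the enumerated estimates is the randomization expectation, and it agrees with `tau_bar` up to floating-point error.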
Sampling variance and its estimation generalizing the balanced case
Proposition 1 yields a point estimator of the finite-population contrast. However, to quantify the uncertainty associated with this point estimator and draw inference on the contrast, one needs to derive and estimate its sampling variance with respect to the distribution induced by the randomization in the split-plot design. Zhao et al. (2018) derived an expression for this sampling variance for a balanced split-plot design, that is, when conditions (4) are satisfied. They also obtained an estimator of the sampling variance that, like most variance estimators in finite population causal inference (Mukerjee et al., 2018), has a nonnegative bias. Further, they noted that this bias vanishes under between-whole-plot additivity, that is, average treatment effect homogeneity at the whole-plot level. In this section, we derive an expression for the sampling variance and find a variance estimator by generalizing the arguments in Zhao et al. (2018) to the unbalanced case, and examine the properties of the estimator. To that end, we first convert the “raw” potential outcomes to “adjusted” potential outcomes
[TABLE]
for each , , and . An intuition behind this adjustment will be provided shortly, after we introduce its observed version.
For each , define , and . By (8), . Next, for and , define
[TABLE]
In the balanced case, and represent, respectively, the between and within whole-plot mean squares or products in an analysis of variance/covariance decomposition of the potential outcomes.
It is also important to define a measure of heterogeneity of treatment contrasts across the whole-plots. First, let
[TABLE]
denote the whole-plot level treatment contrasts, where is the average potential outcome of all units in whole-plot for treatment combination . The second equality in (9) follows from (1). Also, from (3) and (9), it follows that
[TABLE]
Now define the following measure of heterogeneity of treatment contrasts across the whole-plots:
[TABLE]
where is given by (9). Then, extending the ideas of Zhao et al., (2018), after considerable algebra, we obtain the following result on the sampling variance of , the unbiased estimator of .
Theorem 1.
The sampling variance of is
[TABLE]
Next, to obtain an estimator of the sampling variance, we first define the counterparts of and in (5) and (6) in terms of the adjusted potential outcomes:
[TABLE]
Then it is easy to see from (5), (6) and (8) that
[TABLE]
Note that is the simple average of , , irrespective of whether are equal or not. This is precisely what the relationship between and in (6) reduces to when , providing us with the intuition to generalize the results of Zhao et al., (2018) by substituting the potential outcomes by their adjusted version in view of (12). We now define the following estimator of the sampling variance in Theorem 1:
[TABLE]
where
[TABLE]
These expressions now allow us to work along the lines of Zhao et al., (2018) by substituting (12) in (7). Again, considerable algebra yields the following result:
Theorem 2.
The variance estimator given by (13) estimates the sampling variance of the contrast estimator in Proposition 1 with a nonnegative bias, namely the heterogeneity measure defined by (11).
Remark 1.
Theorem 2 shows that the estimator (13) is a conservative estimator of the sampling variance, with a nonnegative bias given by (11). This property is in line with variance estimators in other situations of randomization based causal inference. Moreover, in the balanced case, by (11), the bias vanishes when the whole-plot level treatment contrasts in (9) are all equal. As observed by Zhao et al. (2018), this happens for every treatment contrast if and only if between-whole-plot additivity holds, which means
[TABLE]
for every pair of treatment combinations. A disturbing feature of the variance estimator (13), however, emerges in the unbalanced case, which is the main focus of this paper. It then remains biased even if between-whole-plot additivity holds, because by (9) and (10), condition (14) implies that every whole-plot level contrast equals the population contrast, and hence
[TABLE]
which is positive when the whole-plot sizes are not all equal, unless the population contrast is zero. The situation remains unchanged even under the stronger assumption of strict additivity or homogeneity of treatment effects (Neyman, 1923), which enforces the constancy, over all units, of the difference in potential outcomes for every pair of treatment combinations.
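A small numerical check makes the size of this residual bias concrete. The sketch below assumes that the bias in (11) is the between-whole-plot sum of squares of the size-adjusted whole-plot contrasts (each whole-plot contrast multiplied by its whole-plot size over the average size), divided by W(W-1) for W whole-plots. This form is an assumption made for illustration rather than a quotation of (11), and the function name `additivity_bias` is hypothetical. With a common whole-plot contrast of one (also an assumption) and the school example's sizes, it gives approximately 0.0133, the value reported for populations I and II in the simulation section.

```python
import numpy as np

def additivity_bias(sizes, contrast):
    """Assumed form of the bias in (11) when every whole-plot level
    contrast equals `contrast` (between-whole-plot additivity)."""
    sizes = np.asarray(sizes, dtype=float)
    n_plots = sizes.size
    # Size-adjusted whole-plot contrasts; their mean is the population contrast.
    scaled = sizes * contrast / sizes.mean()
    return float(((scaled - scaled.mean()) ** 2).sum() / (n_plots * (n_plots - 1)))

unbalanced = additivity_bias([8, 8, 12, 12], contrast=1.0)   # school example sizes
balanced = additivity_bias([10, 10, 10, 10], contrast=1.0)   # equal sizes
```

Under this assumed form the bias scales with the square of the common contrast and disappears only when the sizes are equal or the contrast is zero, matching the statement in Remark 1.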
The property of the variance estimator (13) described in Remark 1 is a matter of concern because a requirement typically imposed on a variance estimator in causal inference is that it should become unbiased at least under Neymanian strict additivity, if not under milder versions thereof such as between-whole-plot additivity in the present context. The estimator (13), obtained by generalizing the arguments in the balanced case, fails to meet this requirement when the whole-plot sizes are not all equal. In the rest of the paper, we investigate the existence of a variance estimator that overcomes this difficulty and show how, under wide generality, such an estimator can be obtained by appropriately modifying the estimator given by (13).
A new variance estimator
We begin our search for an improved variance estimator by expanding the bias term defined in (11) as follows:
[TABLE]
Note that in (15), the term is not unbiasedly estimable, but for , allows unbiased estimation. This is because, by (9),
[TABLE]
The sums over and in (16) include the case . There is at least one pair of distinct treatment combinations and such that and is never observable as unit cannot be assigned simultaneously to both and . Hence, does not allow unbiased estimation. On the other hand, for , does not involve terms like , and is unbiasedly estimable. For each , let denote the level combination of the whole-plot factors assigned to whole-plot . Now define
[TABLE]
The following proposition now gives an unbiased estimator of :
Proposition 2.
For , , an unbiased estimator of is given by
[TABLE]
where is an indicator that equals one if and zero otherwise.
We can now use Proposition 2 to construct a new estimator of . Consider any symmetric matrix of order such that for . Now define the variance estimator
[TABLE]
where is the variance estimator defined in Section 3, and is as defined in Proposition 2. Then, from (15), (17), Theorem 2 and Proposition 2 it is easy to see that
[TABLE]
where
[TABLE]
Clearly, the bias in (18) is nonnegative, making the new estimator (17) a conservative estimator of the sampling variance, if the matrix is nonnegative definite. Furthermore, by (18), this bias vanishes if and only if the whole-plot level contrasts are all equal, when the matrix has each row sum zero and is positive semidefinite of rank one less than its order. These facts are summarized in Theorem 3, which is the main result of this section.
Theorem 3.
Let there exist a positive semidefinite matrix of order and satisfying the conditions: (c1) , (c2) , and (c3) rank. Then the variance estimator defined in (17) estimates with a nonnegative bias given by (18), which vanishes if and only if .
Remark 2.
Recall that the between-whole-plot additivity condition (14) is equivalent to equality of the whole-plot level contrasts in (9) for every treatment contrast. Thus, even when the whole-plot sizes are not all equal, by Theorem 3, the bias vanishes for every treatment contrast if and only if between-whole-plot additivity holds. Hence, if a positive semidefinite matrix satisfying conditions (c1)-(c3) is available, then Theorem 3 provides us with a variance estimator that possesses properties similar to the one derived by Zhao et al. (2018) for the balanced case. However, the issue of existence of such a matrix turns out to be quite challenging, and will be explored in the next section.
Existence and construction
We will now study the existence of a positive semidefinite matrix satisfying conditions (c1)-(c3) stated in Theorem 3 as a purely mathematical problem. Without loss of generality, we assume hereafter that
[TABLE]
To motivate the ideas, consider first the case , where conditions (c1) and (c2) determine uniquely as
[TABLE]
This matrix is also positive semidefinite and satisfies (c3) if and only if its principal minor, given by the first two rows and columns, is positive. Simplification of this condition and application of (19) yields as the necessary and sufficient condition for to satisfy (c1)-(c3). This construction of for raises the following questions with respect to the general case :
- (a)
Is the condition
[TABLE]
necessary and sufficient for existence of a positive semidefinite matrix satisfying (c1)-(c3)?
- (b)
If so, then under (21), can one construct such a matrix by an extension of the form in (20) to the general case?
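The case that motivates these questions can be explored numerically. The sketch below assumes three whole-plots and assumes that (c1) fixes the diagonal entries at the squared whole-plot sizes while (c2) requires zero row sums; under those assumptions symmetry determines the off-diagonal entries uniquely. The function name `three_plot_matrix` and the example sizes are hypothetical and are not taken from the paper.

```python
import numpy as np

def three_plot_matrix(sizes):
    """Symmetric 3x3 matrix with squared sizes on the diagonal (assumed
    reading of (c1)) and zero row sums (c2)."""
    sq = np.asarray(sizes, dtype=float) ** 2
    A = np.diag(sq)
    for w, v in [(0, 1), (0, 2), (1, 2)]:
        third = 3 - w - v                      # index of the remaining whole-plot
        A[w, v] = A[v, w] = (sq[third] - sq[w] - sq[v]) / 2.0
    return A

ok = three_plot_matrix([3, 4, 5])    # largest size below the sum of the others
bad = three_plot_matrix([1, 2, 4])   # largest size above the sum of the others
eig_ok = np.linalg.eigvalsh(ok)      # eigenvalues in ascending order
eig_bad = np.linalg.eigvalsh(bad)
```

In this illustration the first matrix is positive semidefinite of rank 2, while the second has a negative eigenvalue, which is the behaviour the principal-minor condition describes.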
Later in this section, Theorem 4 answers (a) in the affirmative. On the other hand, the question in (b) does not allow a conclusive answer. To see why, observe that the most obvious extension of (20) to general is given by , with
[TABLE]
The divisors in (22) ensure condition (c2) about zero row sums and make it consistent with (20) when . The form (22) is also natural because, in keeping with as the diagonal elements of , it takes the off-diagonal elements as linear combinations of in a systematic manner. However, unlike the case of , the matrix given by (22) may not be positive semidefinite for , even when the condition (21) holds. For instance, if , then this condition holds for both the configurations and . The matrix in (22) is positive semidefinite of rank 3 () for the first configuration, but has a negative eigenvalue for the second.
The above discussion makes it clear that, in general, the task of obtaining a positive semidefinite matrix satisfying (c1)-(c3) under condition (21) can be far more complex than what the form (20) arising for three whole-plots suggests. Theorem 4 establishes condition (21) as a necessary and sufficient condition for existence of such a matrix.
Theorem 4.
Let . Then condition (21), that is, , is necessary and sufficient for the existence of a positive semidefinite matrix of order and satisfying the conditions (c1) , (c2) , and (c3) rank.
The sufficiency part of the proof of Theorem 4 leads to a construction procedure of the matrix satisfying conditions (c1)-(c3). If , then one can simply take at each diagonal position of and at each off-diagonal position. Turning next to the case of unequal , suppose condition (21) holds. Let , where the prime denotes transposition, and let denote the vector of ones. Then the steps involved in the construction of the matrix are:
- Step 1: Find a vector with elements satisfying the condition
[TABLE]
- Step 2: Find nonnegative constants and , satisfying and the following condition:
[TABLE]
- Step 3: Construct the following matrix:
[TABLE]
where , and are obtained from steps 1 and 2 above, is the identity matrix of order and .
- Step 4: Construct matrix as follows:
[TABLE]
Then is positive semidefinite of order and satisfies (c1)-(c3) by the proof of the sufficiency part of Theorem 4. A lemma, crucial in this proof, appears in the supplementary material and guarantees the existence of vector in step 1 and constants and in step 2 under condition (21).
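Whatever construction is used, a candidate matrix can be checked numerically against the three conditions. The sketch below is not the four-step procedure itself. It assumes that (c1) fixes the diagonal at the squared whole-plot sizes, (c2) is zero row sums and (c3) is rank one less than the order, and it illustrates the check on the equal-size case, where those assumptions force a common off-diagonal value. The names `satisfies_c1_c3` and `equal_size_matrix` are hypothetical.

```python
import numpy as np

def satisfies_c1_c3(A, sizes, tol=1e-8):
    """Numerical check of symmetry, (c1)-(c3) and positive semidefiniteness."""
    A = np.asarray(A, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    n_plots = sizes.size
    eig = np.linalg.eigvalsh(A)
    return bool(
        np.allclose(A, A.T, atol=tol)
        and np.allclose(np.diag(A), sizes ** 2, atol=tol)   # assumed (c1)
        and np.allclose(A.sum(axis=1), 0.0, atol=tol)       # (c2): zero row sums
        and eig[0] > -tol                                   # positive semidefinite
        and int(np.sum(eig > tol)) == n_plots - 1           # (c3): rank W - 1
    )

def equal_size_matrix(size, n_plots):
    """Equal whole-plot sizes: common off-diagonal entry -size^2 / (W - 1)."""
    off = -size ** 2 / (n_plots - 1)
    return (size ** 2 - off) * np.eye(n_plots) + off * np.ones((n_plots, n_plots))

balanced_ok = satisfies_c1_c3(equal_size_matrix(10, 4), [10, 10, 10, 10])
diagonal_fails = satisfies_c1_c3(np.diag([100.0] * 4), [10, 10, 10, 10])
```

The diagonal matrix in the last line has the right diagonal but nonzero row sums, so the check rejects it.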
Remark 3.
It is satisfying that the condition (21) holds under wide generality. It only requires the largest whole-plot to be not too large compared to the others and holds, in particular, when there is a tie about the largest whole-plot.
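A condition of this kind is easy to test for a given set of whole-plot sizes. The sketch below assumes that (21) requires the largest whole-plot size to be strictly smaller than the sum of all the others, and that at least three whole-plots are present. Both are assumptions consistent with the description in this remark, not a quotation of (21), and the function name `condition_21_holds` is hypothetical.

```python
def condition_21_holds(sizes):
    """Assumed form of condition (21): the largest whole-plot size is
    strictly smaller than the sum of the remaining sizes."""
    ordered = sorted(sizes)
    if len(ordered) < 3:
        raise ValueError("this sketch assumes at least three whole-plots")
    return ordered[-1] < sum(ordered[:-1])

school_example = condition_21_holds([8, 8, 12, 12])   # tie for the largest size
dominant_plot = condition_21_holds([2, 3, 10])        # one very large whole-plot
```

With a tie for the largest size and at least three whole-plots, the sum of the remaining sizes always exceeds the largest one, which is why ties guarantee the condition under this reading.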
Remark 4.
For , one can check that the construction stated above yields the unique in (20). For , however, a positive semidefinite matrix meeting (c1)-(c3) is non-unique. Indeed, then the above construction itself can yield a wide class of such matrices considering all vectors which satisfy (23), and for each such , all nonnegative satisfying and (24). Thus, the issue of discriminating among rival choices of becomes important. Such a discriminating strategy is discussed in Section 6.
Minimax estimators unbiased under between-whole-plot additivity
As seen in Section 5, while condition (21) guarantees the existence of matrix and consequently a variance estimator that is unbiased under between-whole-plot additivity, such a matrix is non-unique. Thus, it is important to define a criterion that can discriminate among possible choices of . Clearly, a good choice should control the bias given by (18) that is associated with the estimation of . The hurdle here is that, are unknown. Even the idea of minimaxity does not work without further refinement, because is positive semidefinite, and hence is unbounded with respect to variation of in the -dimensional real space. On the other hand, by (10), multiplication of by any nonzero constant only rescales the treatment contrast , without essentially altering it. We, therefore, consider minimization of subject to . This is motivated by Mukerjee et al., (2018) who touched upon split-plot designs only in the balanced case. It is easy to see that the above formulation calls for obtaining , subject to (c1)-(c3), so as to minimize , the largest eigenvalue of . The following proposition provides us with a lower bound for .
Proposition 3.
For any positive semidefinite matrix satisfying (c1)-(c3), a lower bound for is given by , but this bound is unattainable whenever are not all equal.
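One elementary lower bound of this kind follows from a trace argument. The sketch below writes $A$ for the matrix, $W$ for its order and $M_w$ for the whole-plot sizes, and assumes that (c1) fixes the diagonal entries of $A$ at $M_w^2$; whether this coincides with the bound stated in Proposition 3 is not confirmed here. Since $A$ is positive semidefinite of rank $W-1$, it has exactly $W-1$ positive eigenvalues $\lambda_1(A), \ldots, \lambda_{W-1}(A)$, so

```latex
\begin{aligned}
\sum_{w=1}^{W} M_w^2 \;=\; \operatorname{tr}(A)
  \;=\; \sum_{j=1}^{W-1} \lambda_j(A)
  \;&\le\; (W-1)\,\lambda_{\max}(A), \\
\text{hence}\qquad
\lambda_{\max}(A) \;&\ge\; \frac{1}{W-1}\sum_{w=1}^{W} M_w^2 .
\end{aligned}
```

Equality would force all $W-1$ positive eigenvalues to coincide. Together with zero row sums, this would make $A$ a multiple of the centering matrix $I - W^{-1}J$, whose diagonal entries are all equal, which is impossible when the $M_w$ are not all equal. For whole-plot sizes 8, 8, 12, 12 this bound equals $416/3 \approx 138.7$, below the value 192 reported in Example 1.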
Given Proposition 3, an analytical solution to the minimaxity problem above seems to be intractable in the unbalanced case. This is anticipated, because a complete characterization of matrices satisfying (c1)-(c3) is hard, even though in Section 5, we were able to outline a general method for constructing such matrices when condition (21) holds. As a practical strategy, therefore, it makes sense to concentrate on matrices that can be obtained via this method, with a view to minimizing among these matrices. It is reassuring that even then the class of competing matrices is quite large, as noted in Remark 4.
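The link between the worst-case quadratic form and the largest eigenvalue can be illustrated numerically. The matrix below is a hypothetical example for three whole-plots of sizes 3, 4 and 5, built under the assumption that the diagonal entries are the squared sizes and the row sums are zero; it is not a matrix from the paper, and the constant multiplying the quadratic form in (18) is ignored.

```python
import numpy as np

# Hypothetical 3x3 example: diagonal 9, 16, 25 and zero row sums.
A = np.array([[9.0, 0.0, -9.0],
              [0.0, 16.0, -16.0],
              [-9.0, -16.0, 25.0]])

eigvals, eigvecs = np.linalg.eigh(A)   # eigenvalues in ascending order
lam_max = eigvals[-1]
worst = eigvecs[:, -1]                 # unit vector attaining the maximum

# Quadratic forms at random unit vectors never exceed the largest eigenvalue.
rng = np.random.default_rng(0)
draws = rng.normal(size=(1000, 3))
draws /= np.linalg.norm(draws, axis=1, keepdims=True)
quad_forms = np.einsum("ij,jk,ik->i", draws, A, draws)
```

Minimizing the largest eigenvalue over admissible matrices therefore controls the worst case of the quadratic form over unit-norm vectors, which is the sense of minimaxity used in this section.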
Example 1.
Returning to the school example in Sections 1 and 2, where we have , and , the smallest obtainable via steps 1 through 4 described in Section 5 is 192, which corresponds to
[TABLE]
as given by , and .
Simulation Results
Whereas Theorem 3 establishes unbiasedness of under (21) and between-whole-plot additivity, and consideration of minimaxity is expected to provide protection under extreme departures from additivity, it is also important to understand how the bias of would compare to that of under different levels of treatment effect heterogeneity. We now conduct some simulations to study this aspect. We consider the estimation of the interaction effect between factors and in the setting of Example 1. The unit-level treatment contrast equals for (Dasgupta et al., 2015). The finite population contrast of interest is . The vector of potential outcomes for unit , denoted by , is generated using the multivariate normal model:
[TABLE]
where
[TABLE]
is the covariance matrix for whole-plot that depends on two parameters: the variance and correlation . Matrices and respectively denote the th order identity matrix and the matrix of ones. Eight possible scenarios (listed in Table 1) for generating the potential outcomes are considered.
Strict additivity holds for population I. The potential outcomes for population II are adjusted, via an appropriate command in R, to ensure that the whole-plot means are always one. Population III generates different but guarantees the same within each whole-plot. Populations IV through VIII differ only with respect to the correlation parameters, which lead to different types of treatment effect heterogeneity. These include all zero correlations in population VI, all negative correlations in population VII, and a mix of positive and negative correlations in population VIII.
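The data-generating step can be sketched as follows. This is a minimal illustration, not the authors' R code. It assumes a compound-symmetric covariance equal to the variance parameter times (1 - rho) times the identity plus rho times the matrix of ones, treats the third block of columns in Table 1 as variance parameters, uses whole-plot sizes 8, 8, 12, 12, and does not reproduce the additional adjustment applied to population II. The names `compound_symmetric_cov` and `generate_population` are hypothetical.

```python
import numpy as np

def compound_symmetric_cov(var, rho, dim=4):
    """Covariance with common variance `var` and common correlation `rho`."""
    return var * ((1.0 - rho) * np.eye(dim) + rho * np.ones((dim, dim)))

def generate_population(means, variances, rhos, sizes, rng):
    """Draw one vector of potential outcomes per unit, whole-plot by whole-plot."""
    blocks = []
    for mu, var, rho, m in zip(means, variances, rhos, sizes):
        cov = compound_symmetric_cov(var, rho, len(mu))
        blocks.append(rng.multivariate_normal(mu, cov, size=m))
    return np.vstack(blocks)

# Settings read from the row for population V in Table 1.
means = [(10, 5, 9, 8), (5, 9, 10, 8), (10, 9, 8, 5), (10, 5, 8, 9)]
variances = [2.5, 2.0, 2.0, 3.0]
rhos = [0.2, 0.4, 0.6, 0.8]
sizes = [8, 8, 12, 12]

rng = np.random.default_rng(2024)
potential_outcomes = generate_population(means, variances, rhos, sizes, rng)
```

With a correlation of one and a common variance, as in population I, the four potential outcomes of a unit differ from one another only by constants, which is how strict additivity arises in that setting.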
From each population, 200 sets of potential outcomes are generated, and the biases of variance estimators and are compared. Note that these biases are given by (11) and given by (18). We also calculate the bias ratio for each population. The results for populations I and II are consistent with our results. In both of these cases, is always zero and is always 0.0133. Boxplots of the distributions of and for populations III-VIII are shown in Figure 1. The median bias ratios for these populations are 0.804, 0.811, 0.811, 0.810, 0.822 and 0.817 respectively. The plots and the median bias ratios establish the robustness of the new estimator with respect to controlling bias under various forms of treatment effect heterogeneity.
Acknowledgement
This work was supported by the J.C. Bose National Fellowship, Government of India, and grants from Indian Institute of Management Calcutta and National Science Foundation, USA.
Appendix: Proofs of results
In what follows, and denote unconditional expectation and covariance with respect to the randomization at the whole-plot stage, while and denote expectation and covariance with respect to the randomization at the sub-plot stage, conditional on the whole-plot stage assignment.
Proof of Proposition 1.
Follows from straightforward conditioning arguments.
∎
Proof of Theorem 1.
Recall that
[TABLE]
Consequently,
[TABLE]
Defining as an indicator that equals one if and zero otherwise, we have
[TABLE]
Next,
[TABLE]
so that
[TABLE]
Hence,
[TABLE]
Since , we have that
[TABLE]
Substituting the expression of from (27) in the above, the first two terms in the expression of in Theorem 1 follow immediately. The last term can be explained as
[TABLE]
∎
Proof of Theorem 2.
[TABLE]
where , and is similarly defined. For any ,
[TABLE]
Thus,
[TABLE]
The result stated in Theorem 2 is evident from the above. ∎
Proof of Proposition 2.
Because , by (5) and the definition of , conditionally on the assignment of the whole-plots to the level combinations of the whole-plot factors, and are independent and the conditional expectation of their product equals
[TABLE]
The result now follows from (9), noting that the pair equals any with probability . ∎
Proof of the necessity part of Theorem 4.
Suppose a positive semidefinite matrix of order and satisfying (c1)-(c3) exists. Then by (c1),
[TABLE]
Hence using (c2), (28), and (c1) in succession,
[TABLE]
which implies . If possible, let equality hold here. Then equality holds throughout in (29), and invoking (28), this yields
[TABLE]
For any such that , by (c1) and (30), the principal minor of , as given by its th, th and th rows and columns turns out to be . Because this principal minor is nonnegative due to positive semidefinite-ness of , it follows that . This, in conjunction with (c1) and (30), implies that , where . But then , and (c3) is violated. This contradiction proves the necessity of the condition . ∎
To prove the sufficiency part of Theorem 4, we first state a lemma that is crucial in this proof and also leads to the algorithm for construction of the symmetric positive semidefinite matrix of order that satisfies conditions (c1)-(c3).
Lemma 1.
Let . Suppose are not all equal and , as per (19). Let denote the vector of ones and .
- (a)
Then there exists a vector with elements such that .
- (b)
If, in addition, condition (21) holds, i.e., , then, with the vector as in (a) above, there exist nonnegative constants satisfying , such that equation (24) holds, i.e.,
[TABLE]
Proof of Lemma 1.
Part (a). It will suffice to show that there exist , each or , such that . One can then simply take . Recall that , as per (19). Because are not all equal, this yields
[TABLE]
Let be the largest nonnegative integer such that
[TABLE]
By (31), . If , define
[TABLE]
and note that
[TABLE]
because by (19) and (32), for . Now, if , then with and as in (33), , by (31) and (34).
Next, let . Then, by (19),
[TABLE]
Let be the largest integer in such that . If , then . So, with , and as in (33) when ,
[TABLE]
by (34).
Now, suppose , in which case . Then,
[TABLE]
As a result, either
[TABLE]
Else,
[TABLE]
Adding these two inequalities, we have , which is impossible by the definition of , because .
If (i) holds, then the choice , , coupled with as in (33) when , entails , by (34). Similarly, if (ii) holds, then the choice , , coupled with as in (33) when , entails .
Part (b): Let , and let the vector be as in part (a) above, so that . Let , and . Then , as . As a result, there exist constants and such that and . Let . Then . Hence, if we take , , then and , because is a weighted average of and , both of which are less than one. Moreover, by the definition of , i.e., and satisfy (24). ∎
Proof of the sufficiency part of Theorem 4.
In view of Lemma 1, this follows from steps 1-4 in Section 5, noting that (i) the matrix there is positive definite, and hence the matrix there is positive semidefinite of rank with each row sum zero, (ii) has diagonal elements , and (iii) by (24),
[TABLE]
because .
∎
References
- Box, G. E. P., Hunter, J. S., and Hunter, W. G. (2005). Statistics for Experimenters: Design, Innovation, and Discovery. John Wiley & Sons, Hoboken, New Jersey, 2nd edition.
- Cochran, W. G. (1977). Sampling Techniques. John Wiley & Sons, New York.
- Cochran, W. G. and Cox, G. M. (1957). Experimental Designs. John Wiley & Sons, Hoboken, New Jersey, 2nd edition.
- Dasgupta, T., Pillai, N. S., and Rubin, D. B. (2015). Causal inference for $2^K$ factorial designs by using potential outcomes. Journal of the Royal Statistical Society, Series B, 77(4):727–753.
- Fisher, R. A. (1925). Statistical Methods for Research Workers. Oliver & Boyd, Edinburgh, Scotland.
- Fisher, R. A. (1935). The Design of Experiments. Oliver & Boyd, Oxford, England, 1st edition.
- Freedman, D. A. (2006). Statistical models for causation: What inferential leverage do they provide? Evaluation Review, 30:691–713.
- Freedman, D. A. (2008). On regression adjustments to experimental data. Advances in Applied Mathematics, 40:180–193.
