Locally optimal designs for generalized linear models within the family of Kiefer $\Phi_k$-criteria
Osama Idais

Abstract
Locally optimal designs for generalized linear models are derived at certain values of the regression parameters. In the present paper a general setup of the generalized linear model is considered. Analytic solutions for optimal designs are developed under Kiefer $\Phi_k$-criteria, highlighting the D- and A-optimal designs. By means of The General Equivalence Theorem, necessary and sufficient conditions in terms of intensity values are obtained to characterize the locally optimal designs. In this context, linear predictors are assumed to constitute first order models with and without intercept on appropriate experimental regions.
Institute for Mathematical Stochastics, Otto-von-Guericke University Magdeburg,
PF 4120, 39016 Magdeburg, Germany
Keywords: generalized linear model, approximate design, The General Equivalence Theorem, intercept term, locally optimal design, analytic solution.
1 Introduction
The generalized linear model (GLM) was developed by Nelder and Wedderburn (1972). It is viewed as a generalization of ordinary linear regression which allows continuous or discrete observations from one-parameter exponential family distributions to be combined with explanatory variables (factors) via proper link functions. Therefore, a wide range of applications can be addressed by GLMs, such as social and educational sciences, clinical trials, insurance and industry. In particular, logistic and probit models are used for binary observations whereas Poisson models and gamma models are used for count and nonnegative continuous observations, respectively (Walker and Duncan (1967), Myers and Montgomery (1997), Fox (2015), Goldburd et al. (2016)). Likelihood methods are utilized to obtain the estimates of the model parameters. The precision of these maximum likelihood estimates (MLEs) is measured by their variance-covariance matrix. In ordinary regression models for which the normality assumption holds, the variance-covariance matrix is exactly (proportional to) the inverse of the Fisher information matrix. In contrast, for GLMs the observations are often non-normal, and therefore large sample theory is required for the statistical inference. In this context, the variance-covariance matrix is approximately the inverse of the Fisher information matrix. It should, however, be emphasized that the Fisher information matrix for GLMs depends on the model parameters. The theory of generalized linear models is presented carefully in McCullagh and Nelder (1989) and Dobson and Barnett (2018).
Although optimal designs are obtained by minimizing the variance-covariance matrix, there is no loss of generality in concentrating on maximizing the Fisher information matrix. For generalized linear models the optimal design cannot be found without prior knowledge of the parameters (Khuri et al. (2006), Atkinson and Woods (2015)). One approach, so-called local optimality, was proposed by Chernoff (1953); it aims at deriving a locally optimal design at a given parameter value (best guess). This approach is widely employed for GLMs. For count data with Poisson models and the Rasch Poisson model see Wang et al. (2006), Russell et al. (2009) and Graßhoff, Holling, and Schwabe (2013, 2015, 2018). For binary data see Abdelbasit and Plackett (1983) and Mathew and Sinha (2001) under logistic models, Biedermann et al. (2006) under dose-response models, and Yang et al. (2012) under logit, log-log and probit models. Furthermore, Gaffke et al. (2019) provided locally D- and A-optimal designs for gamma models. However, optimal designs for GLMs without intercept have not been considered carefully. Kabera et al. (2015) provided analytic proofs of D-optimal designs for zero intercept parameters of a two-binary-factor logistic model with no interaction. Recently, Idais and Schwabe (2019) introduced locally D- and A-optimal designs for gamma models without intercept.
Locally optimal designs for a general setup of generalized linear models have received some attention. Geometrically, Ford et al. (1992) considered only one continuous factor. Atkinson and Haines (1996) presented a study of optimal designs for nonlinear models including GLMs. Yang (2008) provided optimal designs for GLMs with applications to logistic and probit models. A general solution for GLMs was also given in Yang and Stufken (2009). Analytic solutions under the D-criterion were obtained by Tong et al. (2014) under particular restrictions.
The paper is organized as follows. In Section 3 we present some approaches to determine the optimal weights for particular designs under D-, A- and $\Phi_k$-criteria, which will be used in the subsequent sections. Throughout, with the aid of The Equivalence Theorem we establish necessary and sufficient conditions for a design to be locally D-, A- or $\Phi_k$-optimal. We begin with the single-factor model in Section 4. In Section 5 we consider first order models with intercept. In Section 6 we focus on Kiefer $\Phi_k$-criteria for first order models without intercept.
2 Preliminary
In the following subsections we introduce the GLMs and the required notations of optimal design theory.
2.1 Model specification
In the context of generalized linear models the observations (responses) belong to a one-parameter exponential family. The probability density function of a response variable $Y$ is defined as
$$p(y;\theta,\phi)=\exp\Big(\frac{y\theta-b(\theta)}{a(\phi)}+c(y,\phi)\Big), \tag{2.1}$$
where $a(\cdot)$, $b(\cdot)$ and $c(\cdot)$ are known functions whereas $\theta$ is a canonical parameter and $\phi$ is a dispersion parameter. A common computational method for fitting the models to data is provided in the GLM framework. That is, the expected mean is given by $\mathrm{E}(Y)=\mu=b^{\prime}(\theta)$, and the variance is given by $\mathrm{var}(Y)=a(\phi)\,b^{\prime\prime}(\theta)$. The quantity $b^{\prime\prime}(\theta)$ is called the mean-variance function or, equivalently, the variance function of the expected mean, i.e., $V(\mu)=b^{\prime\prime}(\theta)$. Thus we may write $\mathrm{var}(Y)=a(\phi)V(\mu)$, which depends on the values of $\mu$ (see McCullagh and Nelder (1989), Section 2.2.2).
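Consider the experimental region $\mathcal{X}\subseteq\mathbb{R}^{\nu}$ to which the covariate value $\boldsymbol{x}$ belongs. Denote by $\boldsymbol{\beta}\in\mathbb{R}^{p}$ the parameter vector. Let $\boldsymbol{f}(\boldsymbol{x})$ be a $p$-dimensional regression function, i.e., $\boldsymbol{f}(\boldsymbol{x})=\big(f_{1}(\boldsymbol{x}),\ldots,f_{p}(\boldsymbol{x})\big)^{\sf T}$ where the components are real-valued continuous linearly independent functions. The generalized linear model can be introduced as
$$\eta=g(\mu)\quad\text{where}\quad\eta=\boldsymbol{f}^{\sf T}(\boldsymbol{x})\boldsymbol{\beta}, \tag{2.2}$$
where $g$ is a link function that relates the expected mean $\mu$ to the linear predictor $\boldsymbol{f}^{\sf T}(\boldsymbol{x})\boldsymbol{\beta}$. It is assumed that $g$ is one-to-one and differentiable. One can realize that $\mu=\mu(\boldsymbol{x},\boldsymbol{\beta})=g^{-1}\big(\boldsymbol{f}^{\sf T}(\boldsymbol{x})\boldsymbol{\beta}\big)$ and $\mathrm{d}\eta/\mathrm{d}\mu=g^{\prime}\bigl(g^{-1}\big(\boldsymbol{f}^{\sf T}(\boldsymbol{x})\boldsymbol{\beta}\big)\bigr)$, and therefore we can define the intensity function at a point $\boldsymbol{x}\in\mathcal{X}$ as
$$u(\boldsymbol{x},\boldsymbol{\beta})=\Big(\mathrm{var}(Y)\Big(\frac{\mathrm{d}\eta}{\mathrm{d}\mu}\Big)^{2}\Big)^{-1}, \tag{2.3}$$
which is positive and depends on the value of the linear predictor $\boldsymbol{f}^{\sf T}(\boldsymbol{x})\boldsymbol{\beta}$. The intensity function is regarded as the weight for the corresponding unit at the point $\boldsymbol{x}$ (Atkinson and Woods (2015)).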
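As a hedged illustration of the intensity function, the sketch below evaluates it for three familiar GLMs as a function of the linear predictor $\eta$. It assumes the usual form $u=\big(\mathrm{var}(Y)(\mathrm{d}\eta/\mathrm{d}\mu)^{2}\big)^{-1}$ with the dispersion factor set to one; the function names are hypothetical and chosen only for this example.

```python
import numpy as np

def intensity_poisson(eta):
    """Poisson response, log link: var(Y) = mu and d eta/d mu = 1/mu, so u = mu = exp(eta)."""
    return np.exp(eta)

def intensity_logistic(eta):
    """Binary response, logit link: var(Y) = mu(1-mu) and d eta/d mu = 1/(mu(1-mu)), so u = mu(1-mu)."""
    mu = 1.0 / (1.0 + np.exp(-eta))
    return mu * (1.0 - mu)

def intensity_gamma_inverse(eta):
    """Gamma response, inverse link eta = 1/mu: var(Y) is proportional to mu^2,
    d eta/d mu = -1/mu^2, so u is proportional to mu^2 = eta^(-2) (constant factor dropped)."""
    return eta ** (-2.0)

# Intensities at a few illustrative values of the linear predictor.
u_pois = intensity_poisson(0.0)
u_logit = intensity_logistic(0.0)
u_gamma = intensity_gamma_inverse(2.0)
```

In each case the intensity depends on $\boldsymbol{x}$ and $\boldsymbol{\beta}$ only through the linear predictor, which is why the resulting designs are local in $\boldsymbol{\beta}$.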
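The Fisher information matrix for a GLM at a point $\boldsymbol{x}\in\mathcal{X}$ (see Fedorov and Leonov (2013), Subsection 1.3.2) has the form
$$\boldsymbol{M}(\boldsymbol{x},\boldsymbol{\beta})=u(\boldsymbol{x},\boldsymbol{\beta})\,\boldsymbol{f}(\boldsymbol{x})\boldsymbol{f}^{\sf T}(\boldsymbol{x}). \tag{2.4}$$
For the whole set of experimental points $\boldsymbol{x}_{1},\ldots,\boldsymbol{x}_{n}$ the Fisher information matrix reads as
$$\boldsymbol{M}(\boldsymbol{x}_{1},\ldots,\boldsymbol{x}_{n},\boldsymbol{\beta})=\sum_{i=1}^{n}\boldsymbol{M}(\boldsymbol{x}_{i},\boldsymbol{\beta}). \tag{2.5}$$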
The information matrix of the form (2.4) is appropriate for other nonlinear models, e.g., survival time observations which depend on the proportional hazard model (Schmidt and Schwabe (2017)). Moreover, under homoscedastic regression models the intensity function is constant, whereas under heteroscedastic regression models the intensity depends on $\boldsymbol{x}$ only, and thus the information matrix of form (2.4) does not depend on the model parameters. The latter case was discussed in Graßhoff et al. (2007) and in the book by Fedorov and Leonov (2013), p.13.
It is worthwhile mentioning that, unlike for normally distributed response variables, the sampling distributions of the MLEs in GLMs that are used for inference cannot be determined exactly. Therefore, statistical inference for GLMs is conducted for large sample sizes under mild regularity assumptions on the probability density (2.1). Hence,
[TABLE]
where (Fahrmeir and Kaufmann (1985), Theorem 3). Moreover, the variance-covariance matrix of is approximately given by the inverse of the Fisher information matrix (2.5), see Fedorov and Leonov (2013), Section 1.5.,
[TABLE]
2.2 Optimal designs
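Throughout the present work we will deal with the approximate (continuous) design theory, i.e., a design $\xi$ is a probability measure with finite support on the experimental region $\mathcal{X}$,
$$\xi=\left\{\begin{array}{cccc}\boldsymbol{x}_{1}&\boldsymbol{x}_{2}&\dots&\boldsymbol{x}_{m}\\ \omega_{1}&\omega_{2}&\dots&\omega_{m}\end{array}\right\}, \tag{2.7}$$
where $m\in\mathbb{N}$, $\boldsymbol{x}_{1},\ldots,\boldsymbol{x}_{m}\in\mathcal{X}$ are pairwise distinct points and $\omega_{1},\ldots,\omega_{m}>0$ with $\sum_{i=1}^{m}\omega_{i}=1$. The set $\mathrm{supp}(\xi)=\{\boldsymbol{x}_{1},\ldots,\boldsymbol{x}_{m}\}$ is called the support of $\xi$ and $\omega_{1},\ldots,\omega_{m}$ are called the weights of $\xi$, see Silvey (1980), p.15. The information matrix of a design $\xi$ from (2.7) at a parameter point $\boldsymbol{\beta}$ is defined by
$$\boldsymbol{M}(\xi,\boldsymbol{\beta})=\sum_{i=1}^{m}\omega_{i}\,\boldsymbol{M}(\boldsymbol{x}_{i},\boldsymbol{\beta})=\sum_{i=1}^{m}\omega_{i}\,u(\boldsymbol{x}_{i},\boldsymbol{\beta})\,\boldsymbol{f}(\boldsymbol{x}_{i})\boldsymbol{f}^{\sf T}(\boldsymbol{x}_{i}). \tag{2.8}$$
One might recognize $\boldsymbol{M}(\xi,\boldsymbol{\beta})$ as a convex combination of the information matrices of all design points of $\xi$. Another representation of the information matrix (2.8) can be utilized based on the $m\times p$ design matrix $\boldsymbol{F}=\bigl[\boldsymbol{f}(\boldsymbol{x}_{1}),\ldots,\boldsymbol{f}(\boldsymbol{x}_{m})\bigr]^{\sf T}$ and the $m\times m$ weight matrix $\boldsymbol{V}=\mathrm{diag}\bigl(\omega_{i}u(\boldsymbol{x}_{i},\boldsymbol{\beta})\bigr)_{i=1}^{m}$, and hence we can write
$$\boldsymbol{M}(\xi,\boldsymbol{\beta})=\boldsymbol{F}^{\sf T}\boldsymbol{V}\boldsymbol{F}.$$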
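The two representations of the information matrix of a design can be compared numerically. The following is a minimal sketch, assuming the weighted-sum form $\sum_i\omega_i u(\boldsymbol{x}_i,\boldsymbol{\beta})\boldsymbol{f}(\boldsymbol{x}_i)\boldsymbol{f}^{\sf T}(\boldsymbol{x}_i)$ and the matrix form $\boldsymbol{F}^{\sf T}\boldsymbol{V}\boldsymbol{F}$; the single-factor Poisson model and the parameter value used here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def information_matrix(points, weights, beta, f, u):
    """M(xi, beta) as the weighted sum of u(x_i, beta) * f(x_i) f(x_i)^T."""
    p = len(beta)
    M = np.zeros((p, p))
    for x, w in zip(points, weights):
        fx = np.asarray(f(x), dtype=float)
        M += w * u(x, beta) * np.outer(fx, fx)
    return M

def information_matrix_fvf(points, weights, beta, f, u):
    """The same matrix written as F^T V F with V = diag(w_i * u(x_i, beta))."""
    F = np.array([f(x) for x in points], dtype=float)
    V = np.diag([w * u(x, beta) for x, w in zip(points, weights)])
    return F.T @ V @ F

# Illustrative model: f(x) = (1, x)^T with Poisson intensity u = exp(f(x)^T beta).
f = lambda x: np.array([1.0, x])
u = lambda x, beta: float(np.exp(f(x) @ beta))

beta = np.array([0.5, -1.0])          # hypothetical parameter guess
points, weights = [0.0, 1.0], [0.5, 0.5]
M1 = information_matrix(points, weights, beta, f, u)
M2 = information_matrix_fvf(points, weights, beta, f, u)
```

Both functions return the same symmetric $p\times p$ matrix; the $\boldsymbol{F}^{\sf T}\boldsymbol{V}\boldsymbol{F}$ form is convenient for saturated designs, where $\boldsymbol{F}$ is square and invertible.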
Remark.
A particular type of design appears frequently when the support size equals the dimension of $\boldsymbol{f}$, i.e., $m=p$. In such a case the design is minimally supported and it is often called a minimal-support or a saturated design.
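In this paper we focus on optimal designs within the family of Kiefer $\Phi_k$-criteria (Kiefer (1975)). Kiefer $\Phi_k$-criteria aim at minimizing the $k$-norm of the eigenvalues of the variance-covariance matrix and include the most common criteria of D-, A- and E-optimality. Denote by $\lambda_{i}=\lambda_{i}(\xi,\boldsymbol{\beta})$ $(1\leq i\leq p)$ the eigenvalues of a nonsingular information matrix $\boldsymbol{M}(\xi,\boldsymbol{\beta})$. Denote by “$\det$” and “$\mathrm{tr}$” the determinant and the trace of a matrix, respectively. The Kiefer $\Phi_k$-criteria are defined by
$$\Phi_{k}\big(\boldsymbol{M}(\xi,\boldsymbol{\beta})\big)=\Big(\frac{1}{p}\,\mathrm{tr}\big(\boldsymbol{M}^{-k}(\xi,\boldsymbol{\beta})\big)\Big)^{1/k}=\Big(\frac{1}{p}\sum_{i=1}^{p}\lambda_{i}^{-k}\Big)^{1/k},\quad 0<k<\infty,$$
$$\Phi_{0}\big(\boldsymbol{M}(\xi,\boldsymbol{\beta})\big)=\lim_{k\to 0+}\Phi_{k}\big(\boldsymbol{M}(\xi,\boldsymbol{\beta})\big)=\big(\det\boldsymbol{M}^{-1}(\xi,\boldsymbol{\beta})\big)^{1/p},\qquad\Phi_{\infty}\big(\boldsymbol{M}(\xi,\boldsymbol{\beta})\big)=\lim_{k\to\infty}\Phi_{k}\big(\boldsymbol{M}(\xi,\boldsymbol{\beta})\big)=\max_{1\leq i\leq p}\lambda_{i}^{-1}.$$
Note that $\Phi_{0}$, $\Phi_{1}$ and $\Phi_{\infty}$ are the D-, A- and E-criteria, respectively. A $\Phi_k$-optimal design $\xi^{*}$ minimizes the function $\Phi_{k}\big(\boldsymbol{M}(\xi,\boldsymbol{\beta})\big)$ over all designs $\xi$ whose information matrix is nonsingular. For $0\leq k<\infty$ the strict convexity of $\Phi_{k}$ implies that the information matrix of a locally $\Phi_k$-optimal design (at $\boldsymbol{\beta}$) is unique. That is, if $\xi^{*}$ and $\xi^{**}$ are two locally $\Phi_k$-optimal designs (at $\boldsymbol{\beta}$) then $\boldsymbol{M}(\xi^{*},\boldsymbol{\beta})=\boldsymbol{M}(\xi^{**},\boldsymbol{\beta})$ (Kiefer (1975)). In particular, D-optimal designs are constructed to minimize the determinant of the variance-covariance matrix of the estimates or, equivalently, to maximize the determinant of the information matrix. The D-criterion is typically defined by the convex function $\Phi_{\mathrm{D}}(\boldsymbol{M}(\xi,\boldsymbol{\beta}))=-\log\det\big(\boldsymbol{M}(\xi,\boldsymbol{\beta})\big)$. Geometrically, the volume of the asymptotic confidence ellipsoid is inversely proportional to $\sqrt{\det\big(\boldsymbol{M}(\xi,\boldsymbol{\beta})\big)}$, where $\det\big(\boldsymbol{M}(\xi,\boldsymbol{\beta})\big)$ can be determined by the inverse of the product of the squared lengths of the axes. Therefore, the D-optimal designs minimize the volume of the asymptotic confidence ellipsoid.
A-optimal designs are constructed to minimize the trace of the variance-covariance matrix of the estimates, i.e., to minimize the average variance of the estimates. The A-criterion is typically defined by $\Phi_{\mathrm{A}}\big(\boldsymbol{M}(\xi,\boldsymbol{\beta})\big)={\rm tr}\bigl(\boldsymbol{M}^{-1}(\xi,\boldsymbol{\beta})\bigr)$. The A-criterion minimizes the sum of the squared lengths of the axes of the asymptotic confidence ellipsoid. Moreover, E-optimal designs maximize the smallest eigenvalue of $\boldsymbol{M}(\xi,\boldsymbol{\beta})$ and, equivalently, they minimize the squared length of the ‘largest’ axis of the asymptotic confidence ellipsoid.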
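The criteria discussed above can be evaluated from the eigenvalues of an information matrix. The sketch below is only illustrative: it assumes the normalisation $\Phi_k=\big(\tfrac{1}{p}\sum_i\lambda_i^{-k}\big)^{1/k}$ with the limits $k\to 0$ (D) and $k\to\infty$ (E), and the function name `kiefer_phi` is hypothetical.

```python
import numpy as np

def kiefer_phi(M, k):
    """Kiefer Phi_k value of a positive definite information matrix M.

    Assumed normalisation: Phi_k = ((1/p) * sum_i lambda_i^(-k))^(1/k),
    with k = 0 and k = inf handled as limits (D- and E-criterion).
    """
    lam = np.linalg.eigvalsh(M)              # eigenvalues of the symmetric matrix M
    if np.any(lam <= 0):
        raise ValueError("information matrix must be positive definite")
    if k == 0:                               # D: det(M^{-1})^(1/p)
        return float(np.exp(-np.mean(np.log(lam))))
    if np.isinf(k):                          # E: largest eigenvalue of M^{-1}
        return float(1.0 / lam.min())
    return float(np.mean(lam ** (-k)) ** (1.0 / k))

M = np.diag([1.0, 4.0])
phi_d = kiefer_phi(M, 0)        # (1 * 1/4)^(1/2)
phi_a = kiefer_phi(M, 1)        # mean of 1 and 1/4
phi_e = kiefer_phi(M, np.inf)   # 1 / smallest eigenvalue
```

Since $\Phi_k$ is a power mean of the inverse eigenvalues, its value is nondecreasing in $k$ for a fixed matrix, which the example reflects.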
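In order to verify the local optimality of a design The General Equivalence Theorem is usually employed (see Silvey (1980), p.54 and Atkinson et al. (2007), p.137). It provides necessary and sufficient conditions for a design to be optimal and thus the optimality of a suggested design can be easily verified or disproved. The most generic one is the celebrated Kiefer-Wolfowitz equivalence theorem under the D-criterion (Kiefer and Wolfowitz (1960)). The design $\xi^{*}$ is locally $\Phi_k$-optimal (at $\boldsymbol{\beta}$) if and only if
$$u(\boldsymbol{x},\boldsymbol{\beta})\,\boldsymbol{f}^{\sf T}(\boldsymbol{x})\boldsymbol{M}^{-k-1}(\xi^{*},\boldsymbol{\beta})\boldsymbol{f}(\boldsymbol{x})\leq\mathrm{tr}\bigl(\boldsymbol{M}^{-k}(\xi^{*},\boldsymbol{\beta})\bigr)\quad\text{for all }\boldsymbol{x}\in\mathcal{X}. \tag{2.9}$$
Furthermore, if the design $\xi^{*}$ is $\Phi_k$-optimal then inequality (2.9) becomes an equality at its support points.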
Remark.
The left-hand side of condition (2.9) of The General Equivalence Theorem is called the sensitivity function.
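A numerical check of the equivalence-theorem condition can be sketched as follows. The code assumes the condition in the form $u(\boldsymbol{x},\boldsymbol{\beta})\boldsymbol{f}^{\sf T}(\boldsymbol{x})\boldsymbol{M}^{-k-1}\boldsymbol{f}(\boldsymbol{x})\leq\mathrm{tr}(\boldsymbol{M}^{-k})$ for integer $k\geq 0$ and evaluates it on a finite candidate set. As an illustration it uses the homoscedastic linear model ($u\equiv 1$) on $[0,1]$, which is an assumption made for this example only.

```python
import numpy as np

def info_matrix(points, weights, beta, f, u):
    """M(xi, beta) = sum_i w_i u(x_i, beta) f(x_i) f(x_i)^T."""
    return sum(w * u(x, beta) * np.outer(f(x), f(x)) for x, w in zip(points, weights))

def equivalence_gap(x, M, k, beta, f, u):
    """Sensitivity function minus tr(M^{-k}); nonpositive on the whole region at an optimum."""
    fx = np.asarray(f(x), dtype=float)
    Minv = np.linalg.inv(M)
    lhs = u(x, beta) * fx @ np.linalg.matrix_power(Minv, k + 1) @ fx
    rhs = np.trace(np.linalg.matrix_power(Minv, k))
    return float(lhs - rhs)

def satisfies_equivalence(points, weights, k, beta, f, u, candidates, tol=1e-9):
    """True if the gap is nonpositive (up to tol) at every candidate point."""
    M = info_matrix(points, weights, beta, f, u)
    return all(equivalence_gap(x, M, k, beta, f, u) <= tol for x in candidates)

# Illustrative check: f(x) = (1, x)^T with constant intensity on a grid of [0, 1].
f = lambda x: np.array([1.0, x])
u = lambda x, beta: 1.0
beta = np.zeros(2)
grid = np.linspace(0.0, 1.0, 101)

ok_endpoints = satisfies_equivalence([0.0, 1.0], [0.5, 0.5], 0, beta, f, u, grid)
ok_interior = satisfies_equivalence([0.0, 0.5], [0.5, 0.5], 0, beta, f, u, grid)
```

For $k=0$ the equally weighted design on the two endpoints passes the check, whereas the design on $0$ and $0.5$ fails it at $x=1$. A grid check of this kind can only support, not prove, optimality on a continuous region.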
3 Determination of locally optimal weights
In this section we provide the optimal weights of the designs that will be derived throughout the paper with respect to Kiefer $\Phi_k$-criteria, and in particular the A-criterion ($k=1$) and the D-criterion ($k=0$). In the current work we mostly deal with saturated designs (i.e., $m=p$) for generalized linear models. Let the support points be given by $\boldsymbol{x}_{1}^{*},\ldots,\boldsymbol{x}_{p}^{*}$ such that $\boldsymbol{f}(\boldsymbol{x}_{1}^{*}),\ldots,\boldsymbol{f}(\boldsymbol{x}_{p}^{*})$ are linearly independent.
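For the A-criterion ($k=1$) the optimal weights are given according to Pukelsheim (1993), Section 8.8, which has been modified in Gaffke et al. (2019). The design $\xi^{*}$ which achieves the minimum value of ${\rm tr}\bigl(\boldsymbol{M}^{-1}(\xi,\boldsymbol{\beta})\bigr)$ over all designs $\xi$ with $\mathrm{supp}(\xi)=\{\boldsymbol{x}_{1}^{*},\ldots,\boldsymbol{x}_{p}^{*}\}$ is given by
$$\xi^{*}=\left\{\begin{array}{ccc}\boldsymbol{x}_{1}^{*}&\dots&\boldsymbol{x}_{p}^{*}\\ \omega_{1}^{*}&\dots&\omega_{p}^{*}\end{array}\right\},\qquad\omega_{i}^{*}=c^{-1}\Bigl(\frac{c_{ii}}{u_{i}}\Bigr)^{1/2}\ (1\leq i\leq p),\qquad c=\sum_{j=1}^{p}\Bigl(\frac{c_{jj}}{u_{j}}\Bigr)^{1/2},$$
where $u_{i}=u(\boldsymbol{x}_{i}^{*},\boldsymbol{\beta})$ $(1\leq i\leq p)$ and $c_{ii}$ $(1\leq i\leq p)$ are the diagonal entries of the matrix $\boldsymbol{C}=\bigl(\boldsymbol{F}^{-1}\bigr)^{\sf T}\boldsymbol{F}^{-1}$ and $\boldsymbol{F}=\bigl[\boldsymbol{f}(\boldsymbol{x}_{1}^{*}),\ldots,\boldsymbol{f}(\boldsymbol{x}_{p}^{*})\bigr]^{\sf T}$.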
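The A-optimal weights of a saturated design can be computed directly from the design matrix and the intensities. The sketch below assumes weights proportional to $\sqrt{c_{ii}/u_i}$, with $c_{ii}$ the diagonal entries of $(\boldsymbol{F}^{-1})^{\sf T}\boldsymbol{F}^{-1}$, which follows from minimizing $\mathrm{tr}(\boldsymbol{M}^{-1})=\sum_i c_{ii}/(u_i\omega_i)$ subject to $\sum_i\omega_i=1$. The function names and the numerical example are hypothetical.

```python
import numpy as np

def a_optimal_weights_saturated(F, u):
    """Weights proportional to sqrt(c_ii / u_i), c_ii = diagonal of (F^{-1})^T F^{-1}."""
    F = np.asarray(F, dtype=float)
    u = np.asarray(u, dtype=float)
    Finv = np.linalg.inv(F)
    c_diag = np.sum(Finv ** 2, axis=0)       # diagonal of Finv.T @ Finv
    raw = np.sqrt(c_diag / u)
    return raw / raw.sum()

def trace_inverse(F, u, w):
    """tr(M^{-1}) for M = F^T diag(w_i u_i) F."""
    M = F.T @ np.diag(np.asarray(w) * np.asarray(u)) @ F
    return float(np.trace(np.linalg.inv(M)))

# Illustrative case: f(x) = (1, x)^T, support points 0 and 1, unit intensities.
F = np.array([[1.0, 0.0],
              [1.0, 1.0]])
u = np.array([1.0, 1.0])
w = a_optimal_weights_saturated(F, u)
```

In this example the weights are proportional to $(\sqrt{2},1)$, and the resulting trace is smaller than under equal weights.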
For the D-criterion ($k=0$) the optimal weights are given by $\omega_{i}^{*}=1/p$ $(1\leq i\leq p)$, see Lemma 5.3.1 of Silvey (1980). That is, the locally D-optimal saturated design assigns equal weights to the support points. On the other hand, there are no unified formulas for the optimal weights of a non-saturated design, specifically with respect to the D-criterion. However, let the model be given with a parameter vector of dimension $p=3$. The next lemma provides the optimal weights of a design with four support points under certain conditions.
Lemma 3.1.
Let be given such that the vectors , , , are linearly independent. For a given parameter point let for all . Denote
[TABLE]
such that for all . Assume that and . Then the design which achieves the minimum value of -\log\det\bigl{(}\boldsymbol{M}(\xi,\boldsymbol{\beta})\bigr{)} over all designs with is given by where
[TABLE]
Proof.
Let \boldsymbol{f}_{\ell}=\boldsymbol{f}(\boldsymbol{x}_{\ell}^{*})=\big{(}f_{\ell 1},f_{\ell 2},f_{\ell 3}\big{)}^{\sf T}\,\,(1\leq\ell\leq 4). The design matrix is given by \boldsymbol{F}=\bigl{[}\boldsymbol{f}_{1},\boldsymbol{f}_{2},\boldsymbol{f}_{3},\boldsymbol{f}_{4}\bigr{]}^{\sf T}. Denote \boldsymbol{V}={\rm diag}\bigl{(}\omega_{\ell}u_{\ell}\bigr{)}_{\ell=1}^{4}. Then and by the Cauchy-Binet formula the determinant of is given by the function where
[TABLE]
By assumptions , the function is invariant w.r.t. permuting and , i.e., and thus minimizing (3.1) has similar solutions for and . Thus we can write then (3.1) reduces to
[TABLE]
where , , . Thus we obtain the system of two equations , . Straightforward computations show that the solution of the above system is the optimal weights presented by the lemma. Hence, these optimal weights minimizing . ∎
Moreover, saturated designs under Kiefer $\Phi_k$-criteria for a GLM without intercept are of interest, specifically under the first order model $\boldsymbol{f}(\boldsymbol{x})=\boldsymbol{x}$ with parameter vector $\boldsymbol{\beta}=(\beta_{1},\ldots,\beta_{\nu})^{\sf T}$. The locally $\Phi_k$-optimal weights, which yield the minimum value of $\Phi_{k}$ over all saturated designs with the same support, are given by the next lemma.
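Lemma 3.2.
Consider a GLM without intercept with $\boldsymbol{f}(\boldsymbol{x})=\boldsymbol{x}$ on the experimental region $\mathcal{X}$. Denote by $\boldsymbol{e}_{i}$ for all $(1\leq i\leq\nu)$ the $\nu$-dimensional unit vectors. Let $\boldsymbol{x}_{i}^{*}=a_{i}\boldsymbol{e}_{i}$, $a_{i}>0$, for all $(1\leq i\leq\nu)$ be design points in $\mathcal{X}$ such that the vectors $\boldsymbol{f}(\boldsymbol{x}_{1}^{*}),\ldots,\boldsymbol{f}(\boldsymbol{x}_{\nu}^{*})$ are linearly independent. Let $\boldsymbol{\beta}$ be a given parameter point. Let $u_{i}=u(\boldsymbol{x}_{i}^{*},\boldsymbol{\beta})$ for all $(1\leq i\leq\nu)$. For a given positive real vector $\boldsymbol{a}=(a_{1},\ldots,a_{\nu})^{\sf T}$ the design $\xi_{\boldsymbol{a}}^{*}$ which achieves the minimum value of $\Phi_{k}\bigl(\boldsymbol{M}(\xi_{\boldsymbol{a}},\boldsymbol{\beta})\bigr)$ over all designs $\xi_{\boldsymbol{a}}$ with $\mathrm{supp}(\xi_{\boldsymbol{a}})=\{\boldsymbol{x}_{1}^{*},\ldots,\boldsymbol{x}_{\nu}^{*}\}$ assigns weights
$$\omega_{i}^{*}=\frac{(a_{i}^{2}u_{i})^{\frac{-k}{k+1}}}{\sum_{j=1}^{\nu}(a_{j}^{2}u_{j})^{\frac{-k}{k+1}}}\quad(1\leq i\leq\nu)$$
to the corresponding design points in $\{\boldsymbol{x}_{1}^{*},\ldots,\boldsymbol{x}_{\nu}^{*}\}$.
For D-optimality ($k=0$), $\omega_{i}^{*}=1/\nu$ $(1\leq i\leq\nu)$.
For A-optimality ($k=1$), $\omega_{i}^{*}=(a_{i}^{2}u_{i})^{-1/2}/\sum_{j=1}^{\nu}(a_{j}^{2}u_{j})^{-1/2}$ $(1\leq i\leq\nu)$.
For E-optimality ($k\to\infty$), $\omega_{i}^{*}=(a_{i}^{2}u_{i})^{-1}/\sum_{j=1}^{\nu}(a_{j}^{2}u_{j})^{-1}$ $(1\leq i\leq\nu)$.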
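Proof.
Define the $\nu\times\nu$ design matrix $\boldsymbol{F}=\mathrm{diag}(a_{i})_{i=1}^{\nu}$ with the $\nu\times\nu$ weight matrix $\boldsymbol{V}=\mathrm{diag}(u_{i}\omega_{i})_{i=1}^{\nu}$. Then we have $\boldsymbol{M}\bigl(\xi_{\boldsymbol{a}},\boldsymbol{\beta}\bigr)=\boldsymbol{F}^{\sf T}\boldsymbol{V}\boldsymbol{F}=\mathrm{diag}(a_{i}^{2}u_{i}\omega_{i})_{i=1}^{\nu}$ and $\boldsymbol{M}^{-k}\bigl(\xi_{\boldsymbol{a}},\boldsymbol{\beta}\bigr)=\mathrm{diag}\big((a_{i}^{2}u_{i}\omega_{i})^{-k}\big)_{i=1}^{\nu}$ with $\mathrm{tr}\big(\boldsymbol{M}^{-k}(\xi_{\boldsymbol{a}},\boldsymbol{\beta})\big)=\sum_{i=1}^{\nu}(a_{i}^{2}u_{i}\omega_{i})^{-k}$. Note that the eigenvalues of $\boldsymbol{M}^{-k}\bigl(\xi_{\boldsymbol{a}},\boldsymbol{\beta}\bigr)$ are given by $(a_{i}^{2}u_{i}\omega_{i})^{-k}$ $(1\leq i\leq\nu)$. Thus the Kiefer $\Phi_k$-criteria can be written as
$$\Phi_{k}\bigl(\boldsymbol{M}(\xi_{\boldsymbol{a}},\boldsymbol{\beta})\bigr)=\Bigl(\frac{1}{\nu}\sum_{i=1}^{\nu}(a_{i}^{2}u_{i}\omega_{i})^{-k}\Bigr)^{1/k}. \tag{3.2}$$
Now we aim at minimizing $\Phi_{k}$ such that $\omega_{i}>0$ and $\sum_{i=1}^{\nu}\omega_{i}=1$. We write $\omega_{\nu}=1-\sum_{i=1}^{\nu-1}\omega_{i}$; then (3.2) becomes
$$\Phi_{k}\bigl(\boldsymbol{M}(\xi_{\boldsymbol{a}},\boldsymbol{\beta})\bigr)=\Bigl(\frac{1}{\nu}\Bigl(\sum_{i=1}^{\nu-1}(a_{i}^{2}u_{i}\omega_{i})^{-k}+\Bigl(a_{\nu}^{2}u_{\nu}\Bigl(1-\sum_{i=1}^{\nu-1}\omega_{i}\Bigr)\Bigr)^{-k}\Bigr)\Bigr)^{1/k}.$$
It is straightforward to see that the equation $\partial\Phi_{k}/\partial\omega_{i}=0$ $(1\leq i\leq\nu-1)$ is equivalent to
$$(a_{i}^{2}u_{i})^{-k}\,\omega_{i}^{-(k+1)}=(a_{\nu}^{2}u_{\nu})^{-k}\,\omega_{\nu}^{-(k+1)},$$
which gives $\omega_{i}=\Big(a_{\nu}^{2}u_{\nu}/(a_{i}^{2}u_{i})\Big)^{\frac{k}{k+1}}\omega_{\nu}$, thus $(a_{i}^{2}u_{i})^{\frac{k}{k+1}}\omega_{i}=(a_{\nu}^{2}u_{\nu})^{\frac{k}{k+1}}\omega_{\nu}$ $(1\leq i\leq\nu-1)$. This means that the quantities $(a_{i}^{2}u_{i})^{\frac{k}{k+1}}\omega_{i}$ $(1\leq i\leq\nu)$ are all equal, i.e., $(a_{i}^{2}u_{i})^{\frac{k}{k+1}}\omega_{i}=c$ $(1\leq i\leq\nu)$, where $c>0$. It implies that $\omega_{i}=c\,(a_{i}^{2}u_{i})^{\frac{-k}{k+1}}$. Due to $\sum_{i=1}^{\nu}\omega_{i}=1$ we get $c\sum_{i=1}^{\nu}(a_{i}^{2}u_{i})^{\frac{-k}{k+1}}=1$, and thus $c=\bigl(\sum_{i=1}^{\nu}(a_{i}^{2}u_{i})^{\frac{-k}{k+1}}\bigr)^{-1}$. So we finally obtain $\omega_{i}=(a_{i}^{2}u_{i})^{\frac{-k}{k+1}}/\bigl(\sum_{j=1}^{\nu}(a_{j}^{2}u_{j})^{\frac{-k}{k+1}}\bigr)$ for all $(1\leq i\leq\nu)$, which are the optimal weights given by the lemma. ∎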
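The closed form obtained at the end of the proof is easy to evaluate. The following sketch computes the weights $\omega_i\propto(a_i^{2}u_i)^{-k/(k+1)}$ and treats $k=\infty$ as the limiting exponent $-1$; the function name and the numerical values of $\boldsymbol{a}$ and $u_i$ are illustrative assumptions.

```python
import numpy as np

def phi_k_weights(a, u, k):
    """Weights proportional to (a_i^2 u_i)^(-k/(k+1)); k = np.inf uses the limit exponent -1."""
    s = np.asarray(a, dtype=float) ** 2 * np.asarray(u, dtype=float)
    exponent = -1.0 if np.isinf(k) else -k / (k + 1.0)
    raw = s ** exponent
    return raw / raw.sum()

def trace_m_inv_k(a, u, w, k):
    """tr(M^{-k}) for the diagonal matrix M = diag(a_i^2 u_i w_i)."""
    s = np.asarray(a, dtype=float) ** 2 * np.asarray(u, dtype=float)
    return float(np.sum((s * np.asarray(w)) ** (-k)))

a = np.array([1.0, 2.0])       # hypothetical scaling of the unit vectors
u = np.array([1.0, 1.0])       # hypothetical intensities at the support points
w_d = phi_k_weights(a, u, 0)           # equal weights
w_a = phi_k_weights(a, u, 1)           # proportional to (a_i^2 u_i)^(-1/2)
w_e = phi_k_weights(a, u, np.inf)      # proportional to (a_i^2 u_i)^(-1)
```

For $k=1$ the computed weights give a smaller value of $\mathrm{tr}(\boldsymbol{M}^{-1})$ than nearby weight vectors, in line with the lemma.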
4 Single-factor model
In this section we concentrate on the simplest case for which the model is composed by a single factor through the linear predictor
[TABLE]
Let the experimental region be taken to be the continuous unit interval. We introduce the function
[TABLE]
with constant . The function will be utilized for the characterization of the optimal designs. Consider the following conditions:
() is positive and twice continuously differentiable.
() is strictly increasing on .
() is an injective (one-to-one) function.
Recently, Lemma 1 in Konstantinou et al. (2014) showed that under the above conditions ()-() with a locally D-optimal design on is only supported by two points and where . In what follows analogous result is presented for locally optimal designs under various optimality criteria.
Lemma 4.1.
Consider model and experimental region . Let a parameter point be given. Let conditions ()-() be satisfied. Denote by a positive definite matrix and let be constant. Then if the condition of The General Equivalence Theorem is of the form
[TABLE]
then the support points of a locally optimal design is concentrated on exactly two points and where .
Proof.
Let . Then let which is a polynomial in of degree 2 where . Hence, by The Equivalence Theorem is locally optimal (at ) if and only if
[TABLE]
The above inequality is similar to that obtained in the proof of Lemma 1 in Konstantinou et al. (2014) and thus the rest of our proof is analogous to that. ∎
Obviously, under D-optimality we have and whereas, under A-optimality we have c=\mathrm{tr}(\boldsymbol{M}^{-1}(\xi^{*},\boldsymbol{\beta}))=\bigl{(}\sqrt{(a^{2}+1)/u_{b}}+\sqrt{(b^{2}+1)/u_{a}}\bigr{)}/(b-a)^{2} where and and . In general, under Kiefer -criteria we denote and .
As a consequence of Lemma 4.1, we next provide sufficient conditions for a design whose support is the boundaries of , i.e., [math] and to be locally D- or A-optimal on at a given . Let and denote and .
Theorem 4.1.
*Consider model \boldsymbol{f}(x)=\bigl{(}1,x\bigr{)}^{\sf T} and experimental region . Let a parameter point be given. Let be positive, twice continuously differentiable. Then:
() The unique locally D-optimal design (at ) is the two-point design supported by
[math] and with equal weights if*
[TABLE]
*() The unique locally A-optimal design (at ) is the two-point design supported by
[math] and with weights*
[TABLE]
if
[TABLE]
Proof.
Ad () Employing condition (2.9) of The Equivalence Theorem for implies that is locally D-optimal if and only if
[TABLE]
Since the support points are the l.h.s. of the inequality above equals zero at the boundaries of . Then it is sufficient to show that the aforementioned l.h.s. is convex on the interior points and this convexity realizes under condition (4.1) asserted in the theorem
Ad () This case can be shown in analogy to case () by employing condition (2.9) of The Equivalence Theorem for with . ∎
Now consider the discrete experimental region $\mathcal{X}=\{a,b\}$, i.e., the factor is binary. In view of Lemma 4.1 and Section 3 we provide locally D- and A-optimal designs in the next theorem.
Theorem 4.2.
*Consider model \boldsymbol{f}(x)=\bigl{(}1,x\bigr{)}^{\sf T} and experimental region with real numbers . Let a parameter point be given. Let and . Then:
() The unique locally D-optimal design (at ) is the two-point design supported by
and with equal weights .
() The unique locally A-optimal design (at ) is the two-point design supported by
and with weights*
[TABLE]
The locally D-optimal design given in the previous theorem is independent of the intensities, i.e., it is the same for all generalized linear models. Similar results for Poisson models were indicated in Wang et al. (2006). However, for each setup (or each intensity form) of generalized linear models there is a locally A-optimal design that even varies with the parameter values. Since $a$ and $b$ are the only design points, there is a locally D- or A-optimal design at any parameter value in the parameter space of $\boldsymbol{\beta}$.
5 Multiple regression models
In this section we consider a first order model with multiple factors,
[TABLE]
The linear predictor is determined by with binary factors. That is a discrete experimental region is considered and has the form . We aim at constructing locally D- and A-optimal designs for a given parameter point adopting particular analytic solutions.
To this end, we firstly begin with a two-factor model
[TABLE]
The experimental region can be written as . Let us denote the design points by , , , and . The following results that are provided in Theorem 5.1 and Theorem 5.2 under generalized linear model (5.2) are extensions of the corresponding results that were obtained in Gaffke et al. (2019) under a gamma model, i.e., a GLM with inverse link function , where with intensity and the unit cube as an experimental region. The proofs are analogous to those in the reference.
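Theorem 5.1.
Consider model (5.2) and experimental region $\mathcal{X}=\{0,1\}^{2}$. For a given parameter point $\boldsymbol{\beta}$ let $u_{\ell}=u(\boldsymbol{x}_{\ell}^{*},\boldsymbol{\beta})$ $(1\leq\ell\leq 4)$. Denote by $u_{(1)}\leq u_{(2)}\leq u_{(3)}\leq u_{(4)}$ the intensity values rearranged in ascending order. Then:
(o) The locally D-optimal design $\xi^{*}$ (at $\boldsymbol{\beta}$) is unique.
(i) If $u_{(1)}^{-1}\geq u_{(2)}^{-1}+u_{(3)}^{-1}+u_{(4)}^{-1}$ then $\xi^{*}$ is a three-point design supported by the three design points whose intensity values are given by $u_{(2)}$, $u_{(3)}$, $u_{(4)}$, with equal weights $1/3$.
(ii) If $u_{(1)}^{-1}<u_{(2)}^{-1}+u_{(3)}^{-1}+u_{(4)}^{-1}$ then $\xi^{*}$ is a four-point design supported by the four design points $\boldsymbol{x}_{1}^{*},\ldots,\boldsymbol{x}_{4}^{*}$ with weights $\omega_{1}^{*},\ldots,\omega_{4}^{*}$ which are uniquely determined by the condition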
[TABLE]
Remark.
1. It is already seen from the optimality conditions asserted in part (i) of Theorem 5.1 that the design points with the highest intensities form the support of a locally D-optimal design at a given parameter value.
2. The optimality condition asserted in part (ii) of Theorem 5.1 applies only when the optimality conditions for the three-point (saturated) designs in (i) cannot be satisfied.
Theorem 5.1 covers various results in the literature. For example, see Yang et al. (2012) for binary responses with several link functions and Graßhoff et al. (2013) for count data in item response theory.
Now consider the case of equal effect sizes, i.e., $\beta_{1}=\beta_{2}$. Next we give explicit formulas for the weights of locally D-optimal four-point designs at such parameter points.
Corollary 5.1.
Under the assumptions of Theorem 5.1 let the parameter point be given with such that assumption () of Theorem 5.1 is fulfilled. Then the locally D-optimal design (at ) is supported by the four design points with weights
[TABLE]
Proof.
Since assumption () of Theorem 5.1 is fulfilled by a point the design is supported by all points , , , . Then the optimal weights are obtained in view of Lemma 3.1 where we have and . Hence, the results follow. ∎
In analogy to Theorem 5.1 we introduce locally A-optimal designs in the next theorem where also the design points with highest intensities perform as a support of a locally A-optimal design at a given parameter value.
Theorem 5.2.
*Consider the assumptions and notations of Theorem 5.1. Denote . Then:
(o) The locally A-optimal design (at ) is unique.
(i) If then
[TABLE]
(ii) If then
[TABLE]
(iii) If then
[TABLE]
(iv) If then
[TABLE]
For each case (i) – (iv), the constant $c$ appearing in the weights equals the sum of the numerators of the three ratios. If none of the cases (i) – (iv) applies then $\xi^{*}$ is supported by the four design points $\boldsymbol{x}_{1}^{*},\ldots,\boldsymbol{x}_{4}^{*}$.
In the following we consider model (5.1) for a general number of factors, , and with the experimental region . Here, we are interested in providing an extension of locally D- and A-optimal designs with support given in the preceding theorems under a two-factor model.
Theorem 5.3.
Consider model (5.1) with experimental region , where . Denote the design points by
[TABLE]
For a given parameter point let . Then the design which assigns equal weights to the design points for all is locally D-optimal (at ) if and only if
[TABLE]
Proof.
Define the design matrix \boldsymbol{F}=\bigl{[}\boldsymbol{f}(\boldsymbol{x}_{1}^{*}),\ldots,\boldsymbol{f}(\boldsymbol{x}_{\nu+1}^{*})\bigr{]}^{\sf T}, then
[TABLE]
We have
[TABLE]
where , , and denote the -dimensional row vector of zeros, the -dimensional column vector of ones, and the unit matrix, respectively. So, by condition (2.9) of The Equivalence Theorem for the design is locally D-optimal if and only if
[TABLE]
The l.h.s. of (5.6) reads as
[TABLE]
and hence it is obvious that (5.6) is equivalent to (5.4). ∎
Remark.
The D-optimal design under a two-factor model with support , , from Theorem 5.1 is covered by Theorem 5.3 for where condition (5.4) is equivalent to the inequality that is asserted in part () of Theorem 5.1. Moreover, Theorem 5.3 covers various results in the literature. For example; Russell et al. (2009) provided for the Poisson model a locally D-optimal saturated design on the continuous experimental region that is supported by , , , at .
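Whether the equally weighted saturated design on the origin and the unit vectors is locally D-optimal can also be checked numerically through the equivalence theorem with $k=0$, i.e., $u(\boldsymbol{x},\boldsymbol{\beta})\boldsymbol{f}^{\sf T}(\boldsymbol{x})\boldsymbol{M}^{-1}\boldsymbol{f}(\boldsymbol{x})\leq p$ on $\{0,1\}^{\nu}$. The sketch below is an illustration under assumptions: it takes the support to be the origin together with the $\nu$ unit vectors, as in the proof above, uses the Poisson intensity $\exp(\eta)$ as an example, and the function name and parameter values are hypothetical.

```python
import itertools
import numpy as np

def d_optimal_origin_plus_units(beta, u_of_eta, tol=1e-9):
    """Equivalence-theorem check (k = 0) on {0,1}^nu for f(x) = (1, x^T)^T.

    The candidate design puts weight 1/p on the origin and on each unit vector.
    `u_of_eta` maps the linear predictor to the intensity.
    """
    beta = np.asarray(beta, dtype=float)
    nu = beta.size - 1
    p = nu + 1
    f = lambda x: np.concatenate(([1.0], x))
    support = [np.zeros(nu)] + [row for row in np.eye(nu)]
    M = sum(u_of_eta(f(x) @ beta) * np.outer(f(x), f(x)) for x in support) / p
    Minv = np.linalg.inv(M)
    for x in itertools.product([0.0, 1.0], repeat=nu):
        fx = f(np.array(x))
        if u_of_eta(fx @ beta) * (fx @ Minv @ fx) > p + tol:
            return False
    return True

# Two hypothetical Poisson parameter points with nu = 2 binary factors.
optimal_case = d_optimal_origin_plus_units([0.0, -2.0, -2.0], np.exp)
non_optimal_case = d_optimal_origin_plus_units([0.0, 1.0, 1.0], np.exp)
```

With negative slopes the point $(1,1)^{\sf T}$ has the lowest intensity and the check passes; with positive slopes it has the highest intensity and the saturated design on the other three points fails the condition.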
In analogy to Theorem 5.3 we introduce locally A-optimal designs in the next theorem.
Theorem 5.4.
Consider the assumptions and notations of Theorem 5.3. Denote . Then the design which is supported by with weights
[TABLE]
is locally A-optimal (at ) if and only if for all \boldsymbol{x}=(x_{1},\ldots,x_{\nu})^{\sf T}\in\bigl{\{}0,1\bigr{\}}^{\nu}
[TABLE]
Proof.
As in the proof of Theorem 5.3 the design matrix and its inverse are given by (5.5) and we obtain
[TABLE]
This yields and for according to Section 3 with . An elementary calculation shows that the weights given in Section 3 for an A-optimal design coincide with the () as stated in the theorem. Now we show that the design is locally A-optimal if and only if (5.7) holds. Let \boldsymbol{U}={\rm diag}\bigl{(}u_{1},\ldots,u_{p}\bigr{)} and \boldsymbol{\Omega}={\rm diag}\bigl{(}\omega_{1}^{*},\ldots,\omega_{p}^{*}\bigr{)} with the weight matrix . Then we have
[TABLE]
Since \boldsymbol{U}^{-1/2}\boldsymbol{\Omega}^{-1}=c\,{\rm diag}\bigl{(}c_{11}^{-1/2},\ldots,c_{pp}^{-1/2}\bigr{)}, we obtain
[TABLE]
where \boldsymbol{C}^{*}={\rm diag}\bigl{(}c_{11}^{-1/2},\ldots,c_{pp}^{-1/2}\bigr{)}\,\boldsymbol{C}\,{\rm diag}\bigl{(}c_{11}^{-1/2},\ldots,c_{pp}^{-1/2}\bigr{)} So, together with condition (2.9) of The General Equivalence Theorem for the design is locally A-optimal (at ) if and only if
[TABLE]
Straightforward calculation shows that condition (5.7) that provides a characterization of local A-optimality of is equivalent to condition (5.8). ∎
Remark.
Theorem 5.4 with covers the result stated in case () of Theorem 5.2. It can be checked that, with the notations of Theorem 5.2, the inequality is equivalent to assumption (5.7) of Theorem 5.4 for .
6 Model without intercept
In this section we consider GLMs (2.2) having a linear predictor without intercept, i.e., the components for all () and thus for all (). Precisely, we focus on a first order model
[TABLE]
Next, locally optimal designs will be derived under Kiefer $\Phi_k$-criteria and thus the results implicitly cover the D-, A- and E-optimal designs. We provide necessary and sufficient conditions for constructing $\Phi_k$-optimal designs on a general experimental region $\mathcal{X}$. The support points are located at the boundaries of $\mathcal{X}$ and the optimal weights are obtained according to Lemma 3.2.
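Theorem 6.1.
Consider the experimental region $\mathcal{X}$. Given a vector $\boldsymbol{a}=(a_{1},\ldots,a_{\nu})^{\sf T}$ where $a_{i}>0$, $1\leq i\leq\nu$. Let $\boldsymbol{x}_{i}^{*}=a_{i}\boldsymbol{e}_{i}$ $(1\leq i\leq\nu)$ denote the design points that belong to $\mathcal{X}$. For a given parameter point $\boldsymbol{\beta}$ denote $u_{i}=u(\boldsymbol{x}_{i}^{*},\boldsymbol{\beta})$ $(1\leq i\leq\nu)$. Let $\xi_{\boldsymbol{a}}^{*}$ be the saturated design whose support is $\boldsymbol{x}_{1}^{*},\ldots,\boldsymbol{x}_{\nu}^{*}$ with the corresponding weights
$$\omega_{i}^{*}=\frac{(a_{i}^{2}u_{i})^{\frac{-k}{k+1}}}{\sum_{j=1}^{\nu}(a_{j}^{2}u_{j})^{\frac{-k}{k+1}}}\quad(1\leq i\leq\nu).$$
Then $\xi_{\boldsymbol{a}}^{*}$ is locally $\Phi_k$-optimal (at $\boldsymbol{\beta}$) if and only if
$$u(\boldsymbol{x},\boldsymbol{\beta})\sum_{i=1}^{\nu}u_{i}^{-1}a_{i}^{-2}x_{i}^{2}\leq 1\quad\text{for all }\boldsymbol{x}=(x_{1},\ldots,x_{\nu})^{\sf T}\in\mathcal{X}. \tag{6.1}$$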
Proof.
Define the design matrix with the weight matrix
[TABLE]
Then we have
[TABLE]
Adopting these formulas simplifies the l.h.s. of condition (2.9) of The Equivalence Theorem to $u(\boldsymbol{x},\boldsymbol{\beta})\Big(\sum_{i=1}^{\nu}(a_{i}^{2}u_{i})^{\frac{-k}{k+1}}\Big)^{k+1}\sum_{i=1}^{\nu}u_{i}^{-1}a_{i}^{-2}x_{i}^{2}$, which is bounded by $\Big(\sum_{i=1}^{\nu}(a_{i}^{2}u_{i})^{\frac{-k}{k+1}}\Big)^{k+1}$ if and only if condition (6.1) holds true. ∎
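In particular, Theorem 6.1 states that for a given parameter point $\boldsymbol{\beta}$ the locally D-optimal design ($k=0$) has weights $\omega_{i}^{*}=1/\nu$ and the locally A-optimal design ($k=1$) has weights $\omega_{i}^{*}=(a_{i}^{2}u_{i})^{-1/2}/\sum_{j=1}^{\nu}(a_{j}^{2}u_{j})^{-1/2}$, whereas the locally E-optimal design ($k\to\infty$) has weights $\omega_{i}^{*}=(a_{i}^{2}u_{i})^{-1}/\sum_{j=1}^{\nu}(a_{j}^{2}u_{j})^{-1}$ $(1\leq i\leq\nu)$.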
Theorem 6.1 might be applicable for a wide class of GLMs on appropriate experimental regions. Consider a non-intercept gamma model, i.e., with and intensity . Let the experimental region is given by . Due to the positivity assumption of gamma models, i.e., the parameter point must satisfy the condition for all . Therefore, the parameter space is determined by , i.e., for all (). The next corollary is immediate.
Corollary 6.1.
Consider a non-intercept gamma model with on the experimental region and intensity . Given a vector where , . Let for all denote the design points belong to . For a given parameter point let be the saturated design whose support is with the corresponding weights
[TABLE]
Then is locally -optimal (at ).
Proof.
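The corollary covers the result of Theorem 6.1 under a gamma model. For a given $\boldsymbol{x}_{i}^{*}=a_{i}\boldsymbol{e}_{i}$ let $u_{i}=(a_{i}\beta_{i})^{-2}$. Thus $a_{i}^{2}u_{i}=\beta_{i}^{-2}$ $(1\leq i\leq\nu)$. Then condition (6.1) of Theorem 6.1 is equivalent to $\sum_{i=1}^{\nu}\beta_{i}^{2}x_{i}^{2}\leq\bigl(\sum_{i=1}^{\nu}\beta_{i}x_{i}\bigr)^{2}$ for all $\boldsymbol{x}\in\mathcal{X}$. Since $\beta_{i}x_{i}\geq 0$ $(1\leq i\leq\nu)$, the condition holds true for any $\boldsymbol{a}$ at any given $\boldsymbol{\beta}$. ∎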
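The argument of the proof can be illustrated numerically. The sketch below evaluates the inequality $u(\boldsymbol{x},\boldsymbol{\beta})\sum_i x_i^{2}/(a_i^{2}u_i)\leq 1$, which is the form the sensitivity bound takes in the proof of Theorem 6.1, for the gamma intensity $(\boldsymbol{x}^{\sf T}\boldsymbol{\beta})^{-2}$ at randomly drawn points. The experimental region is assumed here to consist of points with positive coordinates and $\boldsymbol{\beta}$ is assumed to have positive components; both are assumptions made for this example, as are the function names and numerical values.

```python
import numpy as np

def condition_holds(x, a, beta, u, tol=1e-9):
    """Check u(x, beta) * sum_i x_i^2 / (a_i^2 u_i) <= 1 with u_i = u(a_i e_i, beta)."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    nu = a.size
    u_i = np.array([u(a[i] * np.eye(nu)[i], beta) for i in range(nu)])
    return bool(u(x, beta) * np.sum(x ** 2 / (a ** 2 * u_i)) <= 1.0 + tol)

# Gamma intensity for the model without intercept (constant factor dropped).
u_gamma = lambda x, beta: float((x @ beta) ** -2.0)

rng = np.random.default_rng(0)
beta = np.array([0.5, 2.0, 1.0])            # hypothetical positive parameter point
a = np.array([1.0, 3.0, 0.7])               # hypothetical positive scaling vector
xs = rng.uniform(0.0, 5.0, size=(1000, 3))  # random points with positive coordinates
all_ok = all(condition_holds(x, a, beta, u_gamma) for x in xs)
```

The check succeeds for every sampled point regardless of the choice of $\boldsymbol{a}$, as the corollary asserts. For other intensities, such as the Poisson intensity, the same function can return `False`, which is why an explicit condition on $\boldsymbol{\beta}$ is needed there.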
Corollary 6.1 covers Theorem 3.1 in Idais and Schwabe (2019) which provided locally D- and A-optimal designs for non-intercept gamma models. For a Poisson model, i.e., with intensity u(\boldsymbol{x},\boldsymbol{\beta})=\exp\big{(}\boldsymbol{x}^{\sf T}\boldsymbol{\beta}\big{)} and experimental region let us restrict to the case of , i.e., the design points are the unit vectors . As a result, condition (6.1) is simplified as presented in the following corollary.
Corollary 6.2.
Consider a non-intercept Poisson model with on the experimental region and intensity u(\boldsymbol{x},\boldsymbol{\beta})=\exp\big{(}\boldsymbol{x}^{\sf T}\boldsymbol{\beta}\big{)}. For a given parameter point define and denote by the descending order of . Let be the saturated design supported by the unit vectors with weights . Then is locally -optimal (at ) if and only if
[TABLE]
Proof.
The corollary covers the result of Theorem 6.1 under a Poisson model with intensity u(\boldsymbol{x},\boldsymbol{\beta})=\exp\big{(}\boldsymbol{x}^{\sf T}\boldsymbol{\beta}\big{)} and . So condition (6.1) reduces to
[TABLE]
For any define the index set such that if and else. So for described by and , if (i.e., ) then the l.h.s. of (6.3) is zero. If , inequality (6.3) becomes an equality. However, the l.h.s. of (6.3) is equal to which thus rewrites as or equivalently as . By the descending order of we obtain for all subsets of same sizes ,
[TABLE]
Denote . Hence, inequality (6.3) is equivalent to for all . Then it is sufficient to show that
[TABLE]
For “”, let then . For “”, firstly, note that thus is true for . Now assume is true for some , i.e., and we want to show that it is true for . We can write
[TABLE]
∎
Remark.
One can slightly highlight on -optimality under a non-intercept linear model with on the continuous experimental region . Here, for all so the information matrices in a linear model are independent of . Note that Theorem 6.1 does not cover a non-intercept linear model on since condition (6.1) does not hold true for . However, the l.h.s. of condition (2.9) of The Equivalence Theorem under linear models, i.e. , is strictly convex and of course it attains its maximum at some vertices of . Thus the support of any (or D, A, E)-optimal design is a subset of . As a result, in particular for D- and A-optimality, one might apply the results of Theorem 3.1 in Huda and Mukerjee (1988) which were obtained under linear models on .
- For odd numbers of factors , the equally weighted designs supported by all such that is either D- or A-optimal.
- For even numbers of factors , the equally weighted design supported by all such that or is D-optimal. Moreover, the design which assigns equal weights to all points such that is A-optimal.
- Abdelbasit, K.M., Plackett, R.L., 1983. Experimental design for binary data. Journal of the American Statistical Association 78, 90–98.
- Atkinson, A., Donev, A., Tobias, R., 2007. Optimum Experimental Designs, with SAS. Volume 34. Oxford University Press, Oxford.
- Atkinson, A., Haines, L., 1996. Designs for nonlinear and generalized linear models, in: Ghosh, S., Rao, C. (Eds.), Design and Analysis of Experiments. Elsevier, Amsterdam. Volume 13 of Handbook of Statistics, pp. 437–475.
- Atkinson, A.C., Woods, D.C., 2015. Designs for generalized linear models, in: Dean, A., Morris, M., Stufken, J., Bingham, D. (Eds.), Handbook of Design and Analysis of Experiments. Chapman & Hall/CRC Press, Boca Raton, pp. 471–514.
- Biedermann, S., Dette, H., Zhu, W., 2006. Optimal designs for dose–response models with restricted design spaces. Journal of the American Statistical Association 101, 747–759.
- Chernoff, H., 1953. Locally optimal designs for estimating parameters. Ann. Math. Statist. 24, 586–602.
- Dobson, A.J., Barnett, A.G., 2018. An Introduction to Generalized Linear Models. 4th ed., CRC Press, Boca Raton.
- Fahrmeir, L., Kaufmann, H., 1985. Consistency and asymptotic normality of the maximum likelihood estimator in generalized linear models. Ann. Statist. 13, 342–368.
