Effect Size Estimation in Linear Mixed Models

Jürgen Groß; Annette Möller

arXiv:2302.14580 · stat.ME · May 23, 2023

TL;DR

This paper revisits Cohen's effect size measure $f^2$ within linear mixed models, demonstrating its calculation using R's lme4 package on simulated data, simplifying effect size estimation without needing a coefficient of determination.

Contribution

It introduces a method to compute Cohen's $f^2$ effect size in linear mixed models using standard software, avoiding complex calculations.

Findings

1. $f^2$ can be computed in linear mixed models using R and lme4.
2. The method estimates the effect size without requiring a coefficient of determination.
3. The application is demonstrated on artificially generated data.

Abstract

In this note, we reconsider Cohen's effect size measure $f^{2}$ under linear mixed models and demonstrate its application by employing an artificially generated data set. It is shown how $f^{2}$ can be computed with the statistical software environment R using lme4 without the need for specification and computation of a coefficient of determination.

Equations

$\bm{y}=\bm{X}\bm{\beta}+\bm{Z}\bm{u}+\bm{e}$

$\bm{X}=(\bm{1}_{n}:\bm{X}_{1}:\bm{X}_{2})$

$\operatorname{Cov}(\bm{y})=\sigma^{2}\bm{V},\quad \bm{V}=\bm{Z}\bm{D}\bm{Z}^{T}+\bm{T}$

$\widehat{\bm{\beta}}=(\bm{X}^{T}\bm{V}^{-1}\bm{X})^{-1}\bm{X}^{T}\bm{V}^{-1}\bm{y}$

$\widehat{\sigma}^{2}=\frac{1}{\nu}(\bm{y}-\bm{X}\widehat{\bm{\beta}})^{T}\bm{V}^{-1}(\bm{y}-\bm{X}\widehat{\bm{\beta}}),\quad \nu=n-p$

$F=\frac{(\bm{R}\widehat{\bm{\beta}}-\bm{r})^{T}(\bm{R}\bm{B}\bm{R}^{T})^{-1}(\bm{R}\widehat{\bm{\beta}}-\bm{r})}{r\,\widehat{\sigma}^{2}}=\frac{(\bm{R}\widehat{\bm{\beta}}-\bm{r})^{T}(\bm{R}\bm{B}\bm{R}^{T})^{-1}(\bm{R}\widehat{\bm{\beta}}-\bm{r})}{(\bm{y}-\bm{X}\widehat{\bm{\beta}})^{T}\bm{V}^{-1}(\bm{y}-\bm{X}\widehat{\bm{\beta}})}\cdot\frac{\nu}{r}$

$\operatorname{Cov}(\widehat{\bm{\beta}})=\sigma^{2}\bm{B},\quad \bm{B}=(\bm{X}^{T}\bm{V}^{-1}\bm{X})^{-1}$

$f^{2}=\frac{(\bm{R}_{1}\widehat{\bm{\beta}})^{T}(\bm{R}_{1}\bm{B}\bm{R}_{1}^{T})^{-1}(\bm{R}_{1}\widehat{\bm{\beta}})}{(\bm{y}-\bm{X}\widehat{\bm{\beta}})^{T}\bm{V}^{-1}(\bm{y}-\bm{X}\widehat{\bm{\beta}})}$

$f^{2}=\frac{R_{A,B}^{2}-R_{A}^{2}}{1-R_{A,B}^{2}}$

$\bm{D}=\operatorname{diag}[(\sigma_{1}^{2}/\sigma^{2})\bm{I}_{q_{1}},\dots,(\sigma_{m-1}^{2}/\sigma^{2})\bm{I}_{q_{m-1}}]$

$\bm{V}=\bm{I}_{n}+\sum_{i=1}^{m-1}(\sigma_{i}^{2}/\sigma^{2})\bm{Z}_{i}\bm{Z}_{i}^{T}$

$\widehat{\operatorname{Cov}}(\widehat{\bm{\beta}})=\widehat{\sigma}^{2}\widehat{\bm{B}}$

$f^{2}=\frac{1}{\nu}(\bm{R}_{1}\widehat{\bm{\beta}})^{T}(\bm{R}_{1}\widehat{\operatorname{Cov}}(\widehat{\bm{\beta}})\bm{R}_{1}^{T})^{-1}(\bm{R}_{1}\widehat{\bm{\beta}})$

$d_{\ast}=\sqrt{f^{2}(n-2-w)\gamma}$

$\operatorname{Cov}(\bm{y})=\sigma^{2}\bm{V},\quad \bm{V}=k\bm{Z}\bm{Z}^{T}+\bm{I}_{n}$

$R_{AB}^{2}=\frac{(r/\nu)F}{1+(r/\nu)F},\quad r=p-1,\quad \nu=n-p$

$f^{2}=\frac{R_{A,B}^{2}-R_{A}^{2}}{1-R_{A,B}^{2}}=\frac{0.1539263-0.07418754}{1-0.1539263}=0.09424569$

Taxonomy

Topics: Advanced Statistical Methods and Models · Bayesian Modeling and Causal Inference · Statistical Methods and Inference

Full text

Effect Size Estimation in Linear Mixed Models

Jürgen Groß

Institute for Mathematics and Applied Informatics, University of Hildesheim, Germany

and

Annette Möller

Faculty of Business Administration and Economics, Bielefeld University, Germany

Key words and phrases:

Hypothesis testing, effect size, Cohen’s $f^{2}$, linear regression, linear mixed model, multivariate normal distribution

2010 Mathematics Subject Classification:

62J05, 62J20, 62F03

Support of the second author by the Helmholtz Association’s pilot project “Uncertainty Quantification” is gratefully acknowledged.

1. Introduction

In studies with a large number of observations, statistical testing procedures are prone to detect even minor departures from a null hypothesis yielding very small p-values. However, since significance does not automatically imply relevance, measures for the size of the effect associated with a possible rejection of the null hypothesis are often recommended as a useful tool, see e.g. Wilkinson (1999).

A well known effect size measure in a regression context when a quantitative variable $Y$ depends on independent regressors (quantitative and/or qualitative) is Cohen’s $f^{2}$, see Cohen (1988, Chapt. 9). Under multivariate normality this measure is strongly related to an $F$ test of a linear hypothesis that a subset $B$ of independent variables does not substantially contribute to the explanation of $Y$, given a set $A$ of independent regressors already included in the model. One may argue that even if the regression coefficients associated with variables $B$ significantly differ from zero, their contribution may be assessed as relevant only when some meaningful size of their effect is measured.

In this light studies may additionally want to report $f^{2}$ values for each variable in a regression model given the others, see for example Tables 2 to 5 in Taylor et al. (2020). With regard to the procedure of calculating $f^{2}$ , the authors refer to Selya et al. (2012) who introduced a practical method in relation with SAS® software. This approach even carries over to linear mixed models, i.e. the case that some independent variables are associated with random effects rather than with fixed effects. Hence, the original approach by Cohen is generalized to a certain extent, involving the additional estimation of a variance-covariance matrix.

In the following we reconsider and discuss the generalization of $f^{2}$ to linear mixed models. We also point to an alternative computational procedure which turns out to be especially useful in connection with the statistical software environment R (R Core Team, 2022) and the well known package lme4, see Bates et al. (2015), for fitting linear mixed models. Our explanations are illustrated on the basis of an artificially generated data set.

2. Linear Mixed Model

Consider a linear mixed model (LMM) described by

$$\bm{y}=\bm{X}\bm{\beta}+\bm{Z}\bm{u}+\bm{e}, \tag{1}$$

where $\bm{y}$ is an $n\times 1$ observable random vector. It is assumed that the $n\times p$ model matrix $\bm{X}$ of full column rank $p$ can be partitioned as

$$\bm{X}=(\bm{1}_{n}:\bm{X}_{1}:\bm{X}_{2}), \tag{2}$$

where $\bm{1}_{n}$ denotes the $n\times 1$ vector of ones, while the $n\times p_{1}$ and $n\times p_{2}$ matrices $\bm{X}_{1}$ and $\bm{X}_{2}$ contain the values of $p_{1}+p_{2}=p-1$ regressors. The $p\times 1$ vector $\bm{\beta}$ comprises $p$ unknown parameters $\beta_{0},\beta_{1},\ldots,\beta_{p-1}$ addressed as fixed effects. The $n\times q$ matrix $\bm{Z}$ contains the values of independent variables associated with a $q\times 1$ vector $\bm{u}$ of unobservable random effects. It is assumed that $\bm{u}$ has expectation $\bm{0}_{q}$ (the $q\times 1$ vector of zeroes) and variance-covariance matrix $\operatorname{Cov}(\bm{u})=\sigma^{2}\bm{D}$ with unknown parameter $\sigma^{2}>0$ and $q\times q$ matrix $\bm{D}$, which may depend on further unknown parameters. For the $n\times 1$ vector $\bm{e}$ of unobservable random errors it is assumed that $\operatorname{E}(\bm{e})=\bm{0}_{n}$ and $\operatorname{Cov}(\bm{e})=\sigma^{2}\bm{T}$, where the $n\times n$ positive definite matrix $\bm{T}$ may also depend on unknown parameters. Moreover, the assumption $\operatorname{Cov}(\bm{u},\bm{e})=\bm{0}_{q,n}$ (the $q\times n$ matrix of zeroes) implies

$$\operatorname{Cov}(\bm{y})=\sigma^{2}\bm{V},\quad \bm{V}=\bm{Z}\bm{D}\bm{Z}^{T}+\bm{T}. \tag{3}$$

Then the above model may also be represented by the triplet $\{\bm{y},\bm{X}\bm{\beta},\sigma^{2}\bm{V}\}$ and can be considered as a special case of the general Gauss-Markov model, see e.g. Groß (2004). As a matter of fact, formulas useful under model (1) carry over from a classical regression model $\bm{Q}^{-1}\bm{y}=\bm{Q}^{-1}\bm{X}\bm{\beta}+\bm{\varepsilon}$ with $\operatorname{E}(\bm{\varepsilon})=\bm{0}_{n}$ and $\operatorname{Cov}(\bm{\varepsilon})=\sigma^{2}\bm{I}_{n}$, see Christensen (2020, Sect. 2.7). Here $\bm{Q}$ denotes some nonsingular matrix satisfying $\bm{Q}\bm{Q}^{T}=\bm{V}$. Let us assume for the moment that $\bm{V}$ is completely known. Then

$$\widehat{\bm{\beta}}=(\bm{X}^{T}\bm{V}^{-1}\bm{X})^{-1}\bm{X}^{T}\bm{V}^{-1}\bm{y} \tag{4}$$

is the best linear unbiased estimator for $\bm{\beta}$ , and

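$$\widehat{\sigma}^{2}=\frac{1}{\nu}(\bm{y}-\bm{X}\widehat{\bm{\beta}})^{T}\bm{V}^{-1}(\bm{y}-\bm{X}\widehat{\bm{\beta}}),\quad \nu=n-p, \tag{5}$$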

is the usual unbiased estimator for $\sigma^{2}$. Consider the linear hypothesis $H_{0}:\bm{R}\bm{\beta}=\bm{r}$ versus $H_{1}:\bm{R}\bm{\beta}\not=\bm{r}$ for a given $r\times p$ matrix $\bm{R}$ of full row rank and a given $r\times 1$ vector $\bm{r}$. The corresponding $F$ statistic in model (1) is

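$$F=\frac{(\bm{R}\widehat{\bm{\beta}}-\bm{r})^{T}(\bm{R}\bm{B}\bm{R}^{T})^{-1}(\bm{R}\widehat{\bm{\beta}}-\bm{r})}{r\,\widehat{\sigma}^{2}}=\frac{(\bm{R}\widehat{\bm{\beta}}-\bm{r})^{T}(\bm{R}\bm{B}\bm{R}^{T})^{-1}(\bm{R}\widehat{\bm{\beta}}-\bm{r})}{(\bm{y}-\bm{X}\widehat{\bm{\beta}})^{T}\bm{V}^{-1}(\bm{y}-\bm{X}\widehat{\bm{\beta}})}\cdot\frac{\nu}{r}, \tag{6}$$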

where

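$$\operatorname{Cov}(\widehat{\bm{\beta}})=\sigma^{2}\bm{B},\quad \bm{B}=(\bm{X}^{T}\bm{V}^{-1}\bm{X})^{-1}. \tag{7}$$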

Under multivariate normality the statistic $F$ follows an $F_{r,\nu}$ distribution provided $H_{0}$ holds true.
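For a known $\bm{V}$, the estimators (4) and (5) and the statistic (6) can be evaluated directly with matrix algebra. The following base R sketch does so on simulated data. The data-generating values, the choice $\bm{V}=k\bm{Z}\bm{Z}^{T}+\bm{I}_{n}$ with $k=0.5$, and the helper name `gls_F` are assumptions made for illustration and are not taken from the paper.

```r
set.seed(123)
n <- 200
Z <- model.matrix(~ 0 + factor(sample(1:10, n, replace = TRUE)))  # n x q indicator matrix
X <- cbind(1, rbinom(n, 1, 0.5), rnorm(n))                        # (1_n : X1 : X2)
k <- 0.5                                                          # assumed known ratio
V <- k * Z %*% t(Z) + diag(n)
y <- X %*% c(10, 2, 3) + Z %*% rnorm(ncol(Z), sd = sqrt(k)) + rnorm(n)

# Hypothetical helper: GLS estimates and F statistic for H0: R beta = r0
gls_F <- function(y, X, V, R, r0) {
  Vinv   <- solve(V)
  B      <- solve(t(X) %*% Vinv %*% X)                  # B as in (7)
  beta   <- B %*% t(X) %*% Vinv %*% y                   # estimator (4)
  nu     <- nrow(X) - ncol(X)                           # nu = n - p
  res    <- y - X %*% beta
  sigma2 <- as.numeric(t(res) %*% Vinv %*% res) / nu    # estimator (5)
  d      <- R %*% beta - r0
  Fstat  <- as.numeric(t(d) %*% solve(R %*% B %*% t(R)) %*% d) /
    (nrow(R) * sigma2)                                  # statistic (6)
  list(beta = drop(beta), sigma2 = sigma2, F = Fstat,
       df = c(nrow(R), nu),
       p.value = pf(Fstat, nrow(R), nu, lower.tail = FALSE))
}

R   <- rbind(c(0, 1, 0), c(0, 0, 1))   # H0: both slope coefficients are zero
out <- gls_F(y, X, V, R, r0 = c(0, 0))
out$F
out$p.value
```

The p-value uses the $F$ distribution with $r$ and $\nu=n-p$ degrees of freedom, which presupposes multivariate normality and a correctly specified $\bm{V}$.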

2.1. Effect Size

Suppose that we are interested in the effect of the independent variables represented by the model matrix $\bm{X}_{1}$ , given the variables represented by $\bm{X}_{2}$ . The corresponding linear hypothesis reads $\bm{R}_{1}\bm{\beta}=\bm{0}_{p_{1}}$ with $\bm{R}_{1}=(\bm{0}_{p_{1}}:\bm{I}_{p_{1}}:\bm{0}_{p_{1},p_{2}})$ . Hence, by reasoning similar to Cohen (1988), an appropriate measure for the size of the effect based on the above $F$ statistic is provided by

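$$f^{2}=\frac{(\bm{R}_{1}\widehat{\bm{\beta}})^{T}(\bm{R}_{1}\bm{B}\bm{R}_{1}^{T})^{-1}(\bm{R}_{1}\widehat{\bm{\beta}})}{(\bm{y}-\bm{X}\widehat{\bm{\beta}})^{T}\bm{V}^{-1}(\bm{y}-\bm{X}\widehat{\bm{\beta}})}, \tag{8}$$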

thereby removing the factor $\nu/r$ from the $F$ statistic. This generalizes Cohen’s $f^{2}$ in the sense that if $\bm{D}=\bm{0}_{q,q}$ , then the unobservable random vector $\bm{u}$ equals $\bm{0}_{q}$ with probability one, the LMM reduces to the usual linear model of fixed effects only, and $f^{2}$ becomes identical to the measure provided by formula (9.2.1) in Cohen (1988).

We note that it is also possible to compute $f^{2}$ as

$$f^{2}=\frac{R_{A,B}^{2}-R_{A}^{2}}{1-R_{A,B}^{2}} \tag{9}$$

for appropriately defined coefficients of determination $R_{A,B}^{2}$ derived under the full model and $R_{A}^{2}$ derived under a reduced LMM assuming that variables represented by $\bm{X}_{1}$ are not present at all. Such a formula is the basis for the widely applied computational procedure suggested by Selya et al. (2012).
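In the fixed-effects special case ($\bm{V}=\bm{I}_{n}$) the agreement between (8) and (9) can be checked numerically. The sketch below uses simulated data and ordinary `lm` fits; all variable names and data-generating values are illustrative assumptions.

```r
set.seed(42)
n  <- 150
X1 <- rbinom(n, 1, 0.5)
X2 <- rnorm(n)
y  <- 1 + 0.5 * X1 + 2 * X2 + rnorm(n)

# Formula (8) with V = I_n (fixed effects only)
X    <- cbind(1, X1, X2)
B    <- solve(crossprod(X))            # (X^T X)^{-1}
beta <- B %*% crossprod(X, y)          # least squares estimate
res  <- y - X %*% beta
R1   <- matrix(c(0, 1, 0), nrow = 1)   # selects the coefficient of X1
est  <- R1 %*% beta
f2_direct <- as.numeric(t(est) %*% solve(R1 %*% B %*% t(R1)) %*% est) / sum(res^2)

# Formula (9) from the ordinary coefficients of determination
R2_AB <- summary(lm(y ~ X1 + X2))$r.squared   # full model
R2_A  <- summary(lm(y ~ X2))$r.squared        # model without X1
f2_r2 <- (R2_AB - R2_A) / (1 - R2_AB)

all.equal(f2_direct, f2_r2)
```

Both expressions reduce to the increase in the residual sum of squares caused by dropping $X_{1}$, divided by the residual sum of squares of the full model, so they agree up to floating-point error in this special case. For a genuine LMM the agreement depends on how $R^{2}$ is defined.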

2.2. Operational Effect Size

Formula (8) for $f^{2}$ is only operational when $\bm{V}$ is completely known, a condition not met in practical applications. According to Harville (1977) a simple LMM is the ordinary mixed and random effects ANOVA model, also referred to as traditional variance components model, see Christensen (2019, Chapt. 5). Under this model, the variance-covariance matrix of $\bm{y}$ depends on a total of $m$ variance components. The $n\times q$ matrix $\bm{Z}$ is partitioned as $\bm{Z}=(\bm{Z}_{1}:\cdots:\bm{Z}_{m-1})$ , where each $n\times q_{i}$ matrix $\bm{Z}_{i}$ is the design matrix corresponding to a qualitative variable with a certain number of levels. In such a case one may assume

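$$\bm{D}=\operatorname{diag}\left[(\sigma_{1}^{2}/\sigma^{2})\bm{I}_{q_{1}},\dots,(\sigma_{m-1}^{2}/\sigma^{2})\bm{I}_{q_{m-1}}\right] \tag{10}$$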

where $\sigma^{2}>0$ and $\sigma_{1}^{2},\ldots,\sigma_{m-1}^{2}\geq 0$ are $m$ unknown variance components. Then

$$\bm{V}=\bm{I}_{n}+\sum_{i=1}^{m-1}(\sigma_{i}^{2}/\sigma^{2})\bm{Z}_{i}\bm{Z}_{i}^{T}. \tag{11}$$

If $\sigma_{i}^{2}=0$ for some $i$, then the corresponding $q_{i}\times 1$ random effect vector $\bm{u}_{i}$ equals $\bm{0}_{q_{i}}$ with probability one. Such a model, and also more general ones, may be fitted in R with the package lme4, see Bates et al. (2015). From the fitting procedure it is possible to obtain an estimate for the variance-covariance matrix

$$\widehat{\operatorname{Cov}}(\widehat{\bm{\beta}})=\widehat{\sigma}^{2}\widehat{\bm{B}} \tag{12}$$

of the estimated fixed effects parameter vector $\bm{\beta}$ . Then a corresponding estimate for $f^{2}$ is given as

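$$f^{2}=\frac{1}{\nu}(\bm{R}_{1}\widehat{\bm{\beta}})^{T}(\bm{R}_{1}\widehat{\operatorname{Cov}}(\widehat{\bm{\beta}})\bm{R}_{1}^{T})^{-1}(\bm{R}_{1}\widehat{\bm{\beta}}). \tag{13}$$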

where the factor $1/\nu$ is required to compensate for the usage of $\widehat{\sigma}^{2}$ in the estimated variance-covariance matrix (12). The application of this formula is illustrated in the following section. The merit of formula (13) lies in the fact that it can be applied whenever an estimate (12) is available. But then, of course, the actual outcome also depends on the estimation procedure, implying that different estimation methods for (12) may also lead to (slightly) different actual values of (13). This, however, is not the topic of our discussion. Also, when applying (13) there is no need for computing coefficients of determination. Nonetheless it is possible to define an operational version of (9) which corresponds to (13). This is demonstrated in Sect. 3.4, where the $R^{2}$ measure discussed in Edwards et al. (2008) is used with an operational version of the $F$ statistic obtained in the same manner as (13) by employing the very same variance-covariance estimate (12). Again, employing alternative computational methods or alternative measures $R^{2}$ in linear mixed models, see also Nakagawa and Schielzeth (2013); Nakagawa et al. (2017), may result in different actual values of $f^{2}$.
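The role of the factor $1/\nu$ can be made explicit in one step. The display below is a sketch under two assumptions that are not stated in this form in the text: $\widehat{\bm{V}}$ denotes $\bm{V}$ with the estimated variance components plugged in, with $\widehat{\bm{B}}=(\bm{X}^{T}\widehat{\bm{V}}^{-1}\bm{X})^{-1}$, and the estimated covariance matrix of the fixed effects has the form $\widehat{\sigma}^{2}\widehat{\bm{B}}$ with $\widehat{\sigma}^{2}$ computed as in (5) from $\widehat{\bm{V}}$. Then

$$
\frac{1}{\nu}(\bm{R}_{1}\widehat{\bm{\beta}})^{T}(\bm{R}_{1}\,\widehat{\sigma}^{2}\widehat{\bm{B}}\,\bm{R}_{1}^{T})^{-1}(\bm{R}_{1}\widehat{\bm{\beta}})
=\frac{(\bm{R}_{1}\widehat{\bm{\beta}})^{T}(\bm{R}_{1}\widehat{\bm{B}}\bm{R}_{1}^{T})^{-1}(\bm{R}_{1}\widehat{\bm{\beta}})}{\nu\,\widehat{\sigma}^{2}}
=\frac{(\bm{R}_{1}\widehat{\bm{\beta}})^{T}(\bm{R}_{1}\widehat{\bm{B}}\bm{R}_{1}^{T})^{-1}(\bm{R}_{1}\widehat{\bm{\beta}})}{(\bm{y}-\bm{X}\widehat{\bm{\beta}})^{T}\widehat{\bm{V}}^{-1}(\bm{y}-\bm{X}\widehat{\bm{\beta}})},
$$

so that (13) coincides with (8) evaluated at $\widehat{\bm{V}}$. If the software divides by something other than $\nu$ when estimating $\sigma^{2}$ (as maximum likelihood fits typically do), the two expressions differ by the corresponding constant factor.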

3. Example

In the following we discuss the performance of measure $f^{2}$ for an artificially generated data set of $n=1000$ observations intended to further illustrate some computational aspects. For the sake of simplicity our model consists of two independent variables $X_{1}$ (categorical/binary) and $X_{2}$ (quantitative) associated with fixed effects and one variable $Z$ (categorical) associated with random effects. However, the same principles apply when $X_{1}$ , $X_{2}$ , and $Z$ are extended to possible sets of variables containing more than one element. Our setting corresponds to the above mentioned variance components model with $p_{1}=p_{2}=1$ and $m=2$ .

3.1. Variable of Interest

Figure 1 shows a discernible location difference with respect to the distribution of the response variable $Y$ in the two groups indicated by the binary variable $X_{1}$. The Welch two-sample $t$ statistic for the null hypothesis of no difference in group means reads $|t|=6.0751$, implying a highly significant result. A corresponding effect size measure is Cohen’s $d$, which may be computed with the R package effectsize, see Ben-Shachar et al. (2020), as $|d|=0.4122$. According to Cohen (1988), values $|d|=0.2$, $|d|=0.5$ and $|d|=0.8$ indicate a small, medium and large effect, respectively.
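The two quantities reported in this subsection can be reproduced in outline with base R. The sketch below uses simulated stand-ins for $Y$ and $X_{1}$ (the paper's data set is not reproduced here) and computes Cohen's $d$ by hand with the pooled standard deviation. That this matches the settings used with the effectsize package in the paper is an assumption.

```r
set.seed(7)
g <- rbinom(400, 1, 0.5)                # stand-in for the binary variable X1
y <- 50 + 8 * g + rnorm(400, sd = 20)   # stand-in for the response Y

# Welch two-sample t test (var.equal = FALSE is the default of t.test)
welch <- t.test(y ~ factor(g))

# Cohen's d based on the pooled standard deviation
n0 <- sum(g == 0)
n1 <- sum(g == 1)
s_pooled <- sqrt(((n0 - 1) * var(y[g == 0]) + (n1 - 1) * var(y[g == 1])) /
                   (n0 + n1 - 2))
d <- (mean(y[g == 1]) - mean(y[g == 0])) / s_pooled

abs(unname(welch$statistic))   # |t|
abs(d)                         # |d|
```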

3.2. Additional Fixed Effects

From Figure 2 one may conclude, however, that the difference in the two groups may to some extent be explained by the variable $X_{2}$, since there is a tendency for larger values of $X_{2}$ to come along with larger values of $Y$ and observations from group 1 of variable $X_{1}$. Therefore one might be interested in the size of the effect of $X_{1}$ when $X_{2}$ is held constant. This can be achieved by considering a regression model with $X_{1}$ and $X_{2}$ as independent variables and deriving the measure $f^{2}$ as explained in Cohen (1988, Chapt. 9). From package effectsize one gets $f^{2}=0.0017767$. Here, values $f^{2}=0.02$, $f^{2}=0.15$ and $f^{2}=0.35$ are supposed to indicate a small, medium and large effect, respectively.

Recently, Groß and Möller (2023) considered a generalized version $d_{\ast}$ of $d$ as an effect size measure for a binary variable $X_{1}$ given further variables. It may be computed from $f^{2}$ as

$$d_{\ast}=\sqrt{f^{2}(n-2-w)\gamma}, \tag{14}$$

where in our analysis $w=1$ is the number of additional independent variables incorporated in the model and $\sigma^{2}\gamma$ is the variance of the regression coefficient for $X_{1}$ . For our data $\gamma=0.0065821$ , yielding $d_{\ast}=0.108$ and thus confirming a less than small effect.
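As a check on the reported value, the numbers given above can be inserted into (14). The computation assumes that (14) is the square-root form $d_{\ast}=\sqrt{f^{2}(n-2-w)\gamma}$, which is the reading that reproduces the stated result:

$$
d_{\ast}=\sqrt{0.0017767\cdot(1000-2-1)\cdot 0.0065821}=\sqrt{0.0017767\cdot 997\cdot 0.0065821}\approx\sqrt{0.011659}\approx 0.108 .
$$

Without the square root the same inputs would give roughly $0.0117$, which does not match the reported $d_{\ast}=0.108$.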

3.3. Additional Fixed and Random Effects

As a next step one may take the categorical variable $Z$, admitting 15 groups in our data set, into account. From Figure 3 it is seen that there is a tendency for $Z$ groups with larger means of the response variable $Y$ to contain fewer observations marked as 1 (referring to variable $X_{1}$) than $Z$ groups with smaller means of $Y$. However, since observations from group 1 were earlier seen to reveal on average larger $Y$ values than observations from group 0, holding $Z$ constant is expected to contribute in favor of an effect of $X_{1}$ again.

As noted before, the variable $Z$ is meant to be associated with a random effects vector $\bm{u}$ . The corresponding design matrix $\bm{Z}$ has 15 columns where each row contains 0’s and a single 1, indicating the membership of the observations to the respective $Z$ variable group. There are two variance components, the overall $\sigma^{2}>0$ and the random effects variance denoted by $\sigma_{u}^{2}\geq 0$ here. The model may also be reparameterized via an unknown $k\geq 0$ by setting $\sigma_{u}^{2}=k\sigma^{2}$ . This gives variance-covariance matrix

$$\operatorname{Cov}(\bm{y})=\sigma^{2}\bm{V},\quad \bm{V}=k\bm{Z}\bm{Z}^{T}+\bm{I}_{n}, \tag{15}$$

and the LMM reduces to the usual fixed effects regression model in case $k=0$. For $\bm{V}$ from (15) one may compute the effect size $f^{2}$ for $X_{1}$ as a function of $k$ from formula (8) when all variables $X_{1}$, $X_{2}$ and $Z$ are included in the LMM formulation. Figure 4 shows $f^{2}$ for different choices of $k$. As expected, the effect size is larger when both $X_{2}$ and $Z$ are incorporated into the model compared to the case when only $X_{2}$ is employed, corresponding to the choice $k=0$.
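The dependence of $f^{2}$ on $k$ can be traced with a few lines of base R by evaluating (8) for $\bm{V}=k\bm{Z}\bm{Z}^{T}+\bm{I}_{n}$ over a grid of $k$ values. The sketch below uses simulated data with 15 groups. The data-generating values, the grid, and the function name `f2_of_k` are illustrative assumptions, so the resulting curve is not the one shown in Figure 4.

```r
set.seed(2023)
n   <- 300
zf  <- factor(sample(1:15, n, replace = TRUE))   # grouping variable Z
Z   <- model.matrix(~ 0 + zf)                    # n x 15 indicator matrix
X1  <- rbinom(n, 1, 0.5)
X2  <- rnorm(n)
X   <- cbind(1, X1, X2)
y   <- drop(X %*% c(10, 3, 5) + Z %*% rnorm(ncol(Z), sd = 6) + rnorm(n, sd = 8))
R1  <- matrix(c(0, 1, 0), nrow = 1)              # selects the coefficient of X1
ZZt <- tcrossprod(Z)                             # Z Z^T

# Formula (8) for V = k Z Z^T + I_n, treating k as known
f2_of_k <- function(k) {
  Vinv <- solve(k * ZZt + diag(n))
  B    <- solve(t(X) %*% Vinv %*% X)
  beta <- B %*% t(X) %*% Vinv %*% y
  res  <- y - X %*% beta
  est  <- R1 %*% beta
  as.numeric(t(est) %*% solve(R1 %*% B %*% t(R1)) %*% est) /
    as.numeric(t(res) %*% Vinv %*% res)
}

k_grid <- seq(0, 2, by = 0.1)
f2_val <- sapply(k_grid, f2_of_k)
plot(k_grid, f2_val, type = "l", xlab = "k", ylab = expression(f^2))
```

The value at $k=0$ is the ordinary fixed-effects $f^{2}$ for $X_{1}$ given $X_{2}$. How the curve behaves for $k>0$ depends on the simulated data.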

The operational version of $f^{2}$ from (13) can easily be computed from fitting the model by the function lmer from package lme4 as fit <- lmer(Y ~ 1 + X1 + X2 + (1|Z)). The estimated variance components are $\widehat{\sigma}^{2}=393.4455$ and $\widehat{\sigma}_{u}^{2}=180.4234$, giving $\widehat{k}=\widehat{\sigma}_{u}^{2}/\widehat{\sigma}^{2}=0.4586$ as an estimate of $k$, see the dashed vertical line in Figure 4. Then the vector $\widehat{\bm{\beta}}$ is obtained from fixef(fit) and $\widehat{\operatorname{Cov}}(\widehat{\bm{\beta}})$ is obtained from vcov(fit). By using $\bm{R}_{1}=(0,\,1,\,0)$ and $\nu=n-p=997$, formula (13) yields $f^{2}=0.0946626$, indicating a small but not medium effect size of $X_{1}$ when $X_{2}$ and $Z$ are held constant.
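The steps just described can be sketched end to end as follows. The data are simulated here because the paper's data set is not available in this text, so the numbers produced will differ from those reported. The data-generating values and object names are assumptions, while `lmer`, `fixef`, `vcov`, `VarCorr`, `sigma` and `nobs` are the lme4 functions used.

```r
library(lme4)

set.seed(99)
n   <- 500
dat <- data.frame(Z  = factor(sample(1:15, n, replace = TRUE)),
                  X1 = rbinom(n, 1, 0.5),
                  X2 = rnorm(n))
u     <- rnorm(15, sd = 13)   # one random intercept per Z group
dat$Y <- 20 + 6 * dat$X1 + 4 * dat$X2 + u[as.integer(dat$Z)] + rnorm(n, sd = 20)

fit <- lmer(Y ~ 1 + X1 + X2 + (1 | Z), data = dat)   # REML is the lmer default

# Estimated variance components and the implied ratio k
vc       <- as.data.frame(VarCorr(fit))
sigma2_u <- vc$vcov[vc$grp == "Z"]
sigma2   <- sigma(fit)^2
k_hat    <- sigma2_u / sigma2

# Operational f^2 from formula (13)
beta_hat <- fixef(fit)
cov_hat  <- as.matrix(vcov(fit))
R1       <- matrix(c(0, 1, 0), nrow = 1)             # selects the coefficient of X1
nu       <- nobs(fit) - length(beta_hat)             # nu = n - p
est      <- R1 %*% beta_hat
f2_hat   <- as.numeric(t(est) %*% solve(R1 %*% cov_hat %*% t(R1)) %*% est) / nu

c(k_hat = k_hat, f2_hat = f2_hat)
```

For a set of several regressors, `R1` would have one row per coefficient of interest, and the same expression applies unchanged.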

3.4. Coefficient of Determination

Finally, we confirm that $f^{2}$ may also be obtained from formula (9), although there is no actual need for this when formula (13) can be used instead. For this, we define

$$R_{AB}^{2}=\frac{(r/\nu)F}{1+(r/\nu)F},\quad r=p-1,\quad \nu=n-p, \tag{16}$$

which is the $R^{2}$ for linear mixed models proposed by Edwards et al. (2008, Eq. (19)). Here, $F$ is the $F$ statistic from (6) for testing the hypothesis $H_{0}:\bm{R}\bm{\beta}=\bm{0}_{p-1}$ with $\bm{R}=(\bm{0}_{p-1}:\bm{I}_{p-1})$. The relationship between $F$ and $R_{AB}^{2}$ from (16) may be established similarly to Example 4.8 from Seber and Lee (2003). For our data $\bm{R}=(\bm{0}_{2}:\bm{I}_{2})$, $r=2$, and $\nu=n-p=997$. The measure $R_{A}^{2}$ is defined in the same way, but for a reduced model with all variables present except for $X_{1}$. For this reduced model we have $\bm{R}=(0,\,1)$, $\bm{r}=(0)$, $r=1$, $\nu=n-p_{2}-1=998$. This results in

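$$f^{2}=\frac{R_{A,B}^{2}-R_{A}^{2}}{1-R_{A,B}^{2}}=\frac{0.1539263-0.07418754}{1-0.1539263}=0.09424569, \tag{17}$$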

which is nearly the same as the value computed above. The displayed values were obtained from operational versions of $R_{A,B}^{2}$ and $R_{A}^{2}$ computed in the same manner as formula (13) by employing the estimated variance-covariance matrix of the fixed effects resulting from applying vcov() to the two models in question.
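The route via coefficients of determination can be written as three small base R helpers. The function names are hypothetical. The first two implement (16) and (9), and the third computes the $F$ statistic for the hypothesis that all slope coefficients vanish from a coefficient vector and its estimated covariance matrix, for example `fixef(fit)` and `as.matrix(vcov(fit))` of an lme4 fit.

```r
# (16): R^2 implied by an F statistic with r and nu degrees of freedom
r2_from_F <- function(Fstat, r, nu) {
  x <- (r / nu) * Fstat
  x / (1 + x)
}

# (9): Cohen's f^2 from the two coefficients of determination
f2_from_r2 <- function(r2_ab, r2_a) (r2_ab - r2_a) / (1 - r2_ab)

# F statistic for H0 "all slopes are zero", from estimated coefficients b
# (intercept first) and their estimated covariance matrix C
F_all_slopes <- function(b, C) {
  r  <- length(b) - 1
  R  <- cbind(0, diag(r))   # R = (0 : I_{p-1})
  Rb <- R %*% b
  as.numeric(t(Rb) %*% solve(R %*% C %*% t(R)) %*% Rb) / r
}

# The two R^2 values reported in the text give f^2 of about 0.0942
f2_from_r2(0.1539263, 0.07418754)
```

Applied to the full and the reduced model in turn, with $r$ and $\nu$ set as stated above for each model, `F_all_slopes` followed by `r2_from_F` would yield operational versions of $R_{A,B}^{2}$ and $R_{A}^{2}$.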

Bibliography

1. D. Bates, M. Mächler, B. Bolker, and S. Walker. Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67:1–48, 2015. URL https://doi.org/10.18637/jss.v067.i01.
2. M. S. Ben-Shachar, D. Lüdecke, and D. Makowski. effectsize: Estimation of effect size indices and standardized parameters. Journal of Open Source Software, 5:2815, 2020. URL https://doi.org/10.21105/joss.02815.
3. R. Christensen. Advanced Linear Modeling: Statistical Learning and Dependent Data. Springer Nature, 2019.
4. R. Christensen. Plane Answers to Complex Questions. Fifth Edition. Springer, 2020.
5. J. Cohen. Statistical Power Analysis for the Behavioral Sciences. Second Edition. Lawrence Erlbaum Associates, 1988.
6. L. J. Edwards, K. E. Muller, R. D. Wolfinger, B. F. Qaqish, and O. Schabenberger. An $R^{2}$ statistic for fixed effects in the linear mixed model. Statistics in Medicine, 27:6137–6157, 2008.
7. J. Groß. The general Gauss-Markov model with possibly singular dispersion matrix. Statistical Papers, 45:311–336, 2004.
8. J. Groß and A. Möller. A note on Cohen’s d from a partitioned linear regression model. Journal of Statistical Theory and Practice, 17:22, 2023. URL https://doi.org/10.1007/s42519-023-00323-w.