Regression approaches for modeling genotype-environment interaction and making predictions into unseen environments
Maksym Hrachov, Hans-Peter Piepho, Niaz Md. Farhat Rahman, Waqas Ahmed Malik

TL;DR
This paper reviews and connects various regression methods used in plant breeding to improve predictions in new environments by incorporating environmental data.
Contribution
The paper introduces a new approach for estimating prediction uncertainty and unifies diverse regression methods under a common framework.
Findings
Environmental covariates improve prediction accuracy in plant breeding.
A new method enhances estimation of prediction variance for genotype-environment interactions.
Various regression approaches are shown to be closely related within a unified model-based framework.
Abstract
Several seemingly distinct regression methods are closely related. Environmental covariates delivered improved prediction, and a new approach improves estimation of prediction variance. In plant breeding and variety testing, there is an increasing interest in making use of environmental information to enhance predictions for new environments. Here, we will review linear mixed models that have been proposed for this purpose. The emphasis will be on predictions and on methods to assess the uncertainty of predictions for new environments. Our point of departure is straight-line regression, which may be extended to multiple environmental covariates and genotype-specific responses. When observable environmental covariates are used, this is also known as factorial regression. Early work along these lines can be traced back to Stringfield & Salter (1934) and Yates & Cochran (1938), who…
Funding
- Deutsche Forschungsgemeinschaft (http://dx.doi.org/10.13039/501100001659)
- Universität Hohenheim (3153)
Introduction
In plant breeding and variety testing, there is an increasing interest in making use of environmental information to enhance prediction of varietal performance. Key words associated with methods to implement this predictive objective are envirotyping (Xu 2016) and enviromics (Cooper & Messina 2021; Resende et al. 2021). The purpose of this paper is to show how different models for genotype-environment interaction that make use of environmental covariate (EC) information for prediction into a target population of environments (TPE) are related. While we focus on a set of linear mixed models, other linear as well as machine and deep learning approaches also exist in this field (Hu et al. 2025; Yu et al. 2025; Zou et al. 2025). The models we discuss in this paper can readily accommodate nonlinearities by including quadratic terms for the regressors (Buntaran et al. 2021). Interactions among regressors can also be incorporated, leading to a response surface regression framework (Box & Draper 2007).
The point of departure will be the factorial regression (FR) model (Denis et al. 1997). Subsequently, modeling genotype as a random factor, three different variance-covariance structures can be imposed, leading to three different models, which are denoted here as random FR (RFR), environmental kernel approach (Jarquín et al. 2014) and reduced rank regression (RRR) (Buntaran et al. 2021; Tolhurst et al. 2022). We also make a link between RRR and a recent paper (Piepho & Blancon 2023) that proposed an extension of Finlay-Wilkinson regression (Finlay & Wilkinson 1963; Yates & Cochran 1938). Furthermore, we consider the variance of predictions using EC under four basic scenarios. The framework is illustrated using data from a multi-environment trial (MET) in Bangladesh.
Regression models
Factorial regression with random genotypes as the point of departure
The FR model involves a genotype-specific multiple linear regression on the EC (Denis et al. 1997; Piepho & Blancon 2023) and can be written as
$$\eta_{ij} = \alpha_i + \gamma_{i1} x_{j1} + \gamma_{i2} x_{j2} + \dots + \gamma_{ip} x_{jp} \tag{1}$$
where $\eta_{ij}$ is the expected response for the $i$-th genotype $(i = 1, \dots, I)$ in the $j$-th environment $(j = 1, \dots, J)$, $\alpha_i$ is the intercept for the $i$-th genotype, $\gamma_{ik}$ is the slope for the $k$-th environmental covariate (EC) for the $i$-th genotype, and $x_{jk}$ is the value of the $k$-th covariate $(k = 1, \dots, p)$ for the $j$-th environment.
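As a minimal sketch of Eq. (1), the expected responses for all genotype-environment pairs can be computed in one matrix operation. The sizes, the random values and the variable names (`alpha`, `gamma`, `X`, `eta`) are illustrative assumptions and not data from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

I, J, p = 4, 6, 2                           # genotypes, environments, covariates (illustrative)
alpha = rng.normal(5.0, 1.0, size=I)        # genotype-specific intercepts alpha_i
gamma = rng.normal(0.0, 0.5, size=(I, p))   # genotype-specific slopes gamma_ik
X = rng.normal(size=(J, p))                 # covariate values x_jk, one row per environment

# eta[i, j] = alpha_i + sum_k gamma_ik * x_jk for every genotype i and environment j
eta = alpha[:, None] + gamma @ X.T

# The same quantity from the scalar formula for a single genotype-environment pair
i, j = 2, 3
eta_scalar = alpha[i] + sum(gamma[i, k] * X[j, k] for k in range(p))
```

Each row of `eta` is one genotype's regression evaluated at the covariate values of the `J` environments.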
The FR model in Eq. (1) states the expected response $\eta_{ij}$ for given values of the EC and does not include a deviation from the regression. Such deviations are certainly needed to model the observed response, which may be denoted as $y_{ij}$. For making predictions into other environments in a defined TPE, and at the same time assessing the uncertainty of predictions based on a fitted mixed model, it is necessary to model environments as random (Piepho & Williams 2024). This approach is related in spirit to the small area estimation techniques detailed in the book by Rao & Molina (2015), which includes an extensive treatment of model-based mean squared error estimation. If environments are random, deviations from the regression are also random. Moreover, deviations of different genotypes from the regression in the same environment are likely to be positively correlated. The simplest way to account for such correlation is to add a random environmental main effect $u_j$. The full model for observed data then becomes
$$y_{ij} = \eta_{ij} + u_j + e_{ij} \tag{2}$$
where $y_{ij}$ is the observed mean for the $i$-th genotype in the $j$-th environment and $e_{ij}$ is a random residual. The two random effects jointly model the deviation from the regression lines, given by $d_{ij} = u_j + e_{ij}$ (Piepho & Blancon 2023). Note in passing that environments are usually indexed by both years and locations. If this is the case, it is useful to employ a full factorial model for $u_j$ and $e_{ij}$, which, when taking into consideration the factor genotype for the full model, leads to a three-way model (Talbot 1984; Piepho & Williams 2024). This factorization will be considered in Sect. "Contribution of the deviations from regression".
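The positive correlation among deviations within an environment follows from the shared effect $u_j$. As a worked illustration, assume that all $u_j$ and $e_{ij}$ are mutually independent with variances $\sigma_u^2$ and $\sigma_e^2$. These variance symbols are introduced here for illustration only and are not notation from the text.

```latex
\begin{align*}
\operatorname{var}(d_{ij}) &= \sigma_u^2 + \sigma_e^2, \\
\operatorname{cov}(d_{ij}, d_{i'j}) &= \operatorname{var}(u_j) = \sigma_u^2 \quad (i \neq i'), \\
\operatorname{cov}(d_{ij}, d_{i'j'}) &= 0 \quad (j \neq j'), \\
\operatorname{corr}(d_{ij}, d_{i'j}) &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2} \ge 0 .
\end{align*}
```

Under these assumptions, two genotypes in the same environment are uncorrelated only if $\sigma_u^2 = 0$, that is, if there is no environmental main effect beyond the regression.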
Regarding genotypes, one option is to model them as fixed. This kind of model with fixed genotypes and random environments was considered by Denis et al. (1997) and Piepho et al. (1998). Alternatively, genotype may be modeled as a random factor. This model can be re-written as
$$\alpha_i = \mu_\alpha + a_i \tag{3}$$
$$\gamma_{ik} = \mu_{\gamma k} + c_{ik} \tag{4}$$
$$\eta_{ij} = \left(\mu_\alpha + a_i\right) + \left(\mu_{\gamma 1} + c_{i1}\right) x_{j1} + \left(\mu_{\gamma 2} + c_{i2}\right) x_{j2} + \dots + \left(\mu_{\gamma p} + c_{ip}\right) x_{jp} \tag{5}$$
where
$$\left(\begin{array}{c} a_i \\ \boldsymbol{c}_i \end{array}\right) \sim N\left[\left(\begin{array}{c} 0 \\ \boldsymbol{0}_p \end{array}\right), \boldsymbol{\Sigma}\right] \tag{6}$$
Here, $\boldsymbol{c}_i = \left(c_{i1}, \dots, c_{ip}\right)^T$, the superscript $T$ denotes the transpose of a vector or matrix, $\boldsymbol{0}_p$ is a $p$-vector of zeros, and $\boldsymbol{\Sigma}$ is a variance-covariance matrix (common to all genotypes).
Rearranging terms and defining $\boldsymbol{x}_j = \left(x_{j1}, \dots, x_{jp}\right)^T$, this model can be written as
$$\eta_{ij} = f\left(\boldsymbol{x}_j\right) + g_i\left(\boldsymbol{x}_j\right) \tag{7}$$
where the fixed-effects part
$$f\left(\boldsymbol{x}_j\right) = \mu_\alpha + \mu_{\gamma 1} x_{j1} + \mu_{\gamma 2} x_{j2} + \dots + \mu_{\gamma p} x_{jp} \tag{8}$$
corresponds to a mean regression across genotypes, whereas the random part
$$g_i\left(\boldsymbol{x}_j\right) = a_i + c_{i1} x_{j1} + c_{i2} x_{j2} + \dots + c_{ip} x_{jp} \tag{9}$$
models the random deviation of the regression for the $i$-th genotype from the mean regression $f\left(\boldsymbol{x}_j\right)$. An important observation at this point is that the model has a random main effect for genotypes ($a_i$). If we include the random environmental main effect $u_j$ for the observed response, the model for the response $y_{ij}$ has a mixed main effect for environments, given by
$$\varepsilon_j = \mu_{\gamma 1} x_{j1} + \mu_{\gamma 2} x_{j2} + \dots + \mu_{\gamma p} x_{jp} + u_j \tag{10}$$
where the random effect $u_j$ acts as a random deviation from the mean regression $f\left(\boldsymbol{x}_j\right)$ in Eq. (8).
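The decomposition into a mean regression, a genotype-specific random deviation and a mixed environmental main effect can be checked numerically. The sketch below is illustrative only: all numerical values are made up, and for simplicity the random intercepts and slopes are drawn independently, which corresponds to a diagonal $\boldsymbol{\Sigma}$ rather than a general one.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, p = 5, 8, 3                        # illustrative sizes

mu_alpha = 6.0                           # mean intercept
mu_gamma = np.array([0.8, -0.3, 0.1])    # mean slopes
X = rng.normal(size=(J, p))              # covariates, one row per environment
a = rng.normal(0.0, 1.0, size=I)         # random genotype main effects a_i
c = rng.normal(0.0, 0.2, size=(I, p))    # random slope deviations c_ik
u = rng.normal(0.0, 0.5, size=J)         # random environmental main effects u_j
e = rng.normal(0.0, 0.3, size=(I, J))    # residuals e_ij

# Genotype-specific coefficients and expected response eta_ij
alpha = mu_alpha + a
gamma = mu_gamma + c
eta = alpha[:, None] + gamma @ X.T

# Fixed mean regression f(x_j) and random deviation g_i(x_j)
f = mu_alpha + X @ mu_gamma              # length J
g = a[:, None] + c @ X.T                 # I x J

# Mixed main effect for environments: fixed regression part plus u_j
eps = X @ mu_gamma + u

# Observed response assembled in two equivalent ways
y_from_eta = eta + u + e
y_from_parts = mu_alpha + eps + g + e
```

Both `eta == f + g` and `y_from_eta == y_from_parts` hold up to floating-point error, which is just the rearrangement of terms described above.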
Specification of the variance–covariance structure for $\eta_{ij}$
If we collect responses $\eta_{ij}$ for the $i$-th genotype into a vector $\boldsymbol{\eta}_i = \left(\eta_{i1}, \eta_{i2}, \dots, \eta_{iJ}\right)^T$ ordered by environments, we have
$$\boldsymbol{\eta}_i = \boldsymbol{1}_J\left(\mu_\alpha + a_i\right) + \mathbf{X}\left(\boldsymbol{\mu}_\gamma + \boldsymbol{c}_i\right) \tag{11}$$
and
$$\boldsymbol{\eta}_i \sim N\left[\boldsymbol{1}_J \mu_\alpha + \mathbf{X}\boldsymbol{\mu}_\gamma, \boldsymbol{\Omega}\right] \tag{12}$$
where $J$ is the number of environments, $\boldsymbol{1}_J$ is a $J$-vector of ones,
$$\mathbf{X} = \left\{x_{jk}\right\} = \left(\begin{array}{cccc} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{J1} & x_{J2} & \cdots & x_{Jp} \end{array}\right) \tag{13}$$
$$\boldsymbol{\mu}_\gamma = \left(\mu_{\gamma 1}, \mu_{\gamma 2}, \dots, \mu_{\gamma p}\right)^T, ~\mathrm{and}~~ \boldsymbol{\Omega} = \left(\boldsymbol{1}_J \vdots \mathbf{X}\right) \boldsymbol{\Sigma} \left(\boldsymbol{1}_J \vdots \mathbf{X}\right)^T \tag{14}$$
Note that $\left(\boldsymbol{1}_J \vdots \mathbf{X}\right)$ is the column-wise concatenation of $\boldsymbol{1}_J$ and $\mathbf{X}$.
We will assume here that all $p$ EC in $\mathbf{X}$ have been mean-centered and scaled to unit variance. There are different possible specifications for the $(p+1) \times (p+1)$ variance-covariance matrix $\boldsymbol{\Sigma}$ in (6), and consequently different forms of $\boldsymbol{\Omega}$.
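The construction of $\boldsymbol{\Omega}$ from $\boldsymbol{\Sigma}$ and the standardized covariates can be sketched as follows. The raw covariate values are hypothetical, the choice of the sample standard deviation (`ddof=1`) for scaling is an assumption, and `Sigma` is an arbitrary positive semi-definite matrix used only to illustrate the algebra.

```python
import numpy as np

rng = np.random.default_rng(2)
J, p = 7, 2                                   # illustrative sizes

# Hypothetical raw covariates on very different scales
X_raw = rng.normal(loc=[20.0, 300.0], scale=[3.0, 50.0], size=(J, p))

# Mean-center and scale each column to unit variance (ddof=1 is an assumption)
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)

# Column-wise concatenation (1_J : X), a J x (p+1) matrix
Z = np.column_stack([np.ones(J), X])

# Any symmetric positive semi-definite (p+1) x (p+1) matrix can serve as Sigma here
B = rng.normal(size=(p + 1, p + 1))
Sigma = B @ B.T

# Variance-covariance matrix of eta_i across the J environments
Omega = Z @ Sigma @ Z.T

# Element-wise reading: cov(eta_ij, eta_ij') = z_j^T Sigma z_j'
j, j2 = 1, 4
cov_pair = Z[j] @ Sigma @ Z[j2]
```

Because $\boldsymbol{\Omega}$ is built from a $(p+1) \times (p+1)$ matrix, its rank cannot exceed $p+1$, however many environments are observed.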
(i) In random coefficients regression, $\boldsymbol{\Sigma}$ is chosen to be unstructured in order to ensure translational invariance, i.e. invariance to linear transformations of the covariates (Buntaran et al. 2021; Longford 1993). We denote this approach in our context as RFR for random factorial regression.
(ii) By comparison, the kernel approach (Jarquín et al. 2014) assumes
$$\boldsymbol{\Sigma} = \left(\begin{array}{cc} \sigma_\alpha^2 & \boldsymbol{0}_p^T \\ \boldsymbol{0}_p & \mathbf{I}_p \sigma_\gamma^2 \end{array}\right) \tag{15}$$
This model clearly is not translationally invariant, i.e. linear transformations of the columns in $\mathbf{X}$ would alter the fit. However, it has only two parameters and is therefore more parsimonious than the RFR model with unstructured $\boldsymbol{\Sigma}$. In fact, the kernel approach can be regarded as the simplest reduction of the RFR model. The kernel model can also be motivated by a regularization argument and implies a ridge regression (Ruppert et al. 2003, p. 66) on the EC. We further find that
$$\boldsymbol{\Omega} = \boldsymbol{1}_J \boldsymbol{1}_J^T \sigma_\alpha^2 + \mathbf{X}\mathbf{X}^T \sigma_\gamma^2 \tag{16}$$
The matrix $\mathbf{K}_E = \mathbf{X}\mathbf{X}^T$ is seen to be the kernel matrix for environments. Note that standardizing the columns of $\mathbf{X}$ to zero mean and unit variance makes the kernel approach unique and ensures that each EC has equal influence on the regression, but does not resolve the lack of translational invariance. The model can be re-written in scalar form as
$$\eta_{ij} = \mu_\alpha + \mu_{\gamma 1} x_{j1} + \mu_{\gamma 2} x_{j2} + \dots + \mu_{\gamma p} x_{jp} + a_i + w_{ij} \tag{17}$$
where $a_i \sim N\left(0, \sigma_\alpha^2\right)$ is a genotype main effect and $\boldsymbol{w}_i = \left(w_{i1}, w_{i2}, \dots, w_{iJ}\right)^T \sim N\left(\boldsymbol{0}_J, \mathbf{K}_E \sigma_\gamma^2\right)$ is the vector of interactions for the $i$-th genotype. Note that the model in (17) involves the mean regression $f\left(\boldsymbol{x}_j\right)$. Authors applying the kernel approach often omit this mean regression, implying that the expected value of the regression coefficients over genotypes is zero. This assumption is usually unrealistic. A nonzero expectation can be incorporated into the model either explicitly, by including the fixed mean regression for environmental covariates as described here, or implicitly, by including fixed main effects for environments. It must be borne in mind, however, that fitting the mean regression $f\left(\boldsymbol{x}_j\right)$ requires that the number of environments exceeds the number of covariates. The environmental kernel approach is most often applied in cases where this condition does not hold.
(iii) There is an intermediate option between RFR and the kernel approach, which may be termed RRR. For it we may use
$$\boldsymbol{\Sigma} = \boldsymbol{\Lambda} \boldsymbol{\Lambda}^{T}$$
where $\boldsymbol{\Lambda}$ is a $(p+1) \times q$ matrix of factor loadings for $q$ latent factors and $(p+1)$ regression terms (one intercept, $p$ slopes). The RRR model can also be denoted as a factor-analytic (FA) model with no residual variances.
Essentially, this model approximates the unstructured model for $\boldsymbol{\Sigma}$ in RFR using a reduced rank matrix (Buntaran et al. 2021; Tolhurst et al. 2022). It is worth stressing that the RRR approximation to $\boldsymbol{\Sigma}$ is also translationally invariant (Tolhurst 2023). Figure 1 in Appendix A illustrates the connection between these three models.
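The reduced rank structure can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the dimensions `p` and `q` and the randomly drawn loadings are assumptions chosen only for illustration. It builds $\boldsymbol{\Sigma} = \boldsymbol{\Lambda}\boldsymbol{\Lambda}^{T}$, confirms that its rank equals the number of latent factors, and compares the number of free parameters with that of an unstructured matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 4, 2                      # assumed: 4 covariates, 2 latent factors

# Loadings: (p + 1) x q, first row for the intercept, remaining p rows for the slopes
Lambda = rng.normal(size=(p + 1, q))

# Reduced-rank covariance matrix of (intercept, slopes)
Sigma = Lambda @ Lambda.T

rank_sigma = np.linalg.matrix_rank(Sigma)

# Free parameters: rank-q factorisation (after removing rotational freedom)
# versus an unstructured symmetric (p + 1) x (p + 1) matrix
n_params_rrr = (p + 1) * q - q * (q - 1) // 2
n_params_us = (p + 1) * (p + 2) // 2

print(rank_sigma, n_params_rrr, n_params_us)
```

With these assumed dimensions, the reduced rank model needs fewer variance parameters than the unstructured model, which is the sense in which it approximates the latter.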
Two-stage approach to fit the RRR
If we use the partition
$$\boldsymbol{\Lambda} = \begin{pmatrix} \boldsymbol{\lambda}_{\alpha}^{T} \\ \boldsymbol{\Lambda}_{\gamma} \end{pmatrix} = \begin{pmatrix} 1 & \boldsymbol{0}_{q}^{T} \\ \boldsymbol{0}_{p} & \boldsymbol{\Lambda}_{\gamma} \end{pmatrix} \begin{pmatrix} \boldsymbol{\lambda}_{\alpha}^{T} \\ \mathbf{I}_{q} \end{pmatrix}$$
where $\boldsymbol{\lambda}_{\alpha} = \left(\lambda_{\alpha(1)}, \ldots, \lambda_{\alpha(q)}\right)^{T}$ is a $q$-vector of loadings pertaining to the intercept and $\boldsymbol{\Lambda}_{\gamma} = \left\{\lambda_{\gamma(kh)}\right\}$ $\left(k = 1, \ldots, p;\ h = 1, \ldots, q\right)$ is the $p \times q$ sub-matrix of $\boldsymbol{\Lambda}$ pertaining to the slopes, then we may represent the model for the random effects $a_{i}$ and $\boldsymbol{c}_{i}$ by
$$\begin{pmatrix} a_{i} \\ \boldsymbol{c}_{i} \end{pmatrix} = \boldsymbol{\Lambda} \boldsymbol{v}_{i}$$
where $\boldsymbol{v}_{i} \sim N\left(\boldsymbol{0}_{q}, \mathbf{I}_{q}\right)$. Note that this representation implies the reduced rank structure for $\boldsymbol{\Sigma}$ in (18). We find after some re-arrangement that
$$a_{i} = \boldsymbol{\lambda}_{\alpha}^{T} \boldsymbol{v}_{i}$$
$$\boldsymbol{c}_{i} = \boldsymbol{\Lambda}_{\gamma} \boldsymbol{v}_{i}$$
and
$$\mathbf{X} \boldsymbol{c}_{i} = \mathbf{X} \boldsymbol{\Lambda}_{\gamma} \boldsymbol{v}_{i} = \mathbf{Z} \boldsymbol{v}_{i}$$
where
$$\mathbf{Z} = \left\{z_{jh}\right\} = \mathbf{X} \boldsymbol{\Lambda}_{\gamma} \quad \text{and} \quad z_{jh} = \lambda_{\gamma(1h)} x_{j1} + \lambda_{\gamma(2h)} x_{j2} + \ldots + \lambda_{\gamma(ph)} x_{jp}$$
is the value of the $h$-th synthetic environmental covariate (SC) for the $j$-th environment (Piepho & Blancon 2023), and hence
$$\boldsymbol{\eta}_{i} = \boldsymbol{1}_{J} \mu_{\alpha} + \mathbf{X} \boldsymbol{\mu}_{\gamma} + \boldsymbol{1}_{J} a_{i} + \mathbf{Z} \boldsymbol{v}_{i}$$
Also note that
$$\mathrm{var}\begin{pmatrix} a_{i} \\ \boldsymbol{v}_{i} \end{pmatrix} = \begin{pmatrix} \boldsymbol{\lambda}_{\alpha}^{T} \boldsymbol{\lambda}_{\alpha} & \boldsymbol{\lambda}_{\alpha}^{T} \\ \boldsymbol{\lambda}_{\alpha} & \mathbf{I}_{q} \end{pmatrix} = \widetilde{\boldsymbol{\Lambda}} \widetilde{\boldsymbol{\Lambda}}^{T}$$
with
$$\widetilde{\boldsymbol{\Lambda}}^{T} = \begin{pmatrix} \boldsymbol{\lambda}_{\alpha} & \mathbf{I}_{q} \end{pmatrix} \quad \text{and} \quad \boldsymbol{\Omega} = \left(\boldsymbol{1}_{J} \,\vdots\, \mathbf{Z}\right) \widetilde{\boldsymbol{\Lambda}} \widetilde{\boldsymbol{\Lambda}}^{T} \left(\boldsymbol{1}_{J} \,\vdots\, \mathbf{Z}\right)^{T}$$
This is recognized as an RRR using the synthetic covariates $\mathbf{Z}$. The important practical implication of this result is that we can use the approach described in Piepho & Blancon (2023) to derive the synthetic covariates, which involves estimating $\boldsymbol{\Lambda}_{\gamma}$, and then fit (27), which involves estimating $\boldsymbol{\lambda}_{\alpha}$. This can then be regarded as a two-stage approach to fit the RRR.
To exploit this with a mixed model package, it is necessary to keep in mind how the package imposes constraints on $\widetilde{\boldsymbol{\Lambda}}^{T}$. For example, in ASReml-R and SAS, one needs to permute the synthetic covariates $\mathbf{Z}$ and the intercept $\boldsymbol{1}_{J}$ to be able to fit the model at the second stage.
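The equivalence underlying the two-stage approach can be checked numerically. The sketch below is illustrative only: the sizes `J`, `p`, `q` and the randomly drawn loadings stand in for estimates and are assumptions, and no mixed model is actually fitted. It verifies that the covariance implied by the RRR on the original covariates $\mathbf{X}$ equals the covariance $\boldsymbol{\Omega}$ written in terms of the synthetic covariates $\mathbf{Z} = \mathbf{X}\boldsymbol{\Lambda}_{\gamma}$.

```python
import numpy as np

rng = np.random.default_rng(2)
J, p, q = 6, 3, 2                      # assumed: environments, covariates, latent factors

X = rng.normal(size=(J, p))            # environmental covariates (J x p)
lam_alpha = rng.normal(size=q)         # intercept loadings (q-vector), stand-in values
Lam_gamma = rng.normal(size=(p, q))    # slope loadings (p x q), stand-in values
ones = np.ones((J, 1))

# Stage 1 output: synthetic covariates Z = X Lambda_gamma
Z = X @ Lam_gamma

# RRR on the original covariates: Lambda = (lambda_alpha^T ; Lambda_gamma)
Lam = np.vstack([lam_alpha, Lam_gamma])              # (p + 1) x q
D_x = np.hstack([ones, X])                           # (1_J : X)
omega_full = D_x @ Lam @ Lam.T @ D_x.T

# Stage 2 form on the synthetic covariates: Lambda_tilde = (lambda_alpha^T ; I_q)
Lam_tilde = np.vstack([lam_alpha, np.eye(q)])        # (q + 1) x q
D_z = np.hstack([ones, Z])                           # (1_J : Z)
omega_sc = D_z @ Lam_tilde @ Lam_tilde.T @ D_z.T

print(np.allclose(omega_full, omega_sc))
```

Both expressions reduce to the outer product of $\boldsymbol{1}_{J}\boldsymbol{\lambda}_{\alpha}^{T} + \mathbf{Z}$ with itself, which is why the slope loadings can be estimated first and the intercept loadings afterwards.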
Extended Finlay-Wilkinson regression
Piepho and Blancon (2023) suggested obtaining the synthetic covariates in (24) using (11) with $a_{i}$ taken as fixed and assuming $\mathrm{var}\left(\boldsymbol{c}_{i}\right) = \boldsymbol{\Lambda}_{\gamma} \boldsymbol{\Lambda}_{\gamma}^{T}$. In addition, instead of fitting the mixed-effects mean regression according to Eq. (10), one may consider fitting a simple fixed environmental main effect $\varepsilon_{j}$ to make sure the coefficients in $\boldsymbol{\Lambda}_{\gamma}$ are optimized to explain the genotype-environment interaction.
From the fitted matrix $\boldsymbol{\Lambda}_{\gamma}$ one can then compute the SC using $\mathbf{Z} = \mathbf{X} \boldsymbol{\Lambda}_{\gamma}$. Subsequently, the model
$$\boldsymbol{\eta}_{i} = \boldsymbol{1}_{J} \alpha_{i} + \mathbf{Z} \boldsymbol{\beta}_{i}$$
can be fitted, where $\alpha_{i}$ is a fixed intercept and $\boldsymbol{\beta}_{i} = \left(\beta_{i(1)}, \ldots, \beta_{i(q)}\right)^{T}$ are fixed regression coefficients for the $i$-th genotype pertaining to the $q$ SC in $\mathbf{Z}$, assuming that genotype is a fixed factor and environments are random. Piepho and Blancon (2023) referred to this approach as extended Finlay-Wilkinson (FW) regression.
We note that for implementing this approach, it is convenient to impose constraints on $\boldsymbol{\Lambda}$ via $\boldsymbol{\Lambda}_{\gamma} = \left\{\lambda_{kh}\right\}$, requiring that $\lambda_{kh} = 0$ for $k > h$. At the same time no constraints are imposed on $\boldsymbol{\lambda}_{\alpha}$ because $\alpha_{i}$ is modeled as fixed.
Also note that the regression coefficients $\boldsymbol{\beta}_{i}$ in (28) are related to the coefficients $\boldsymbol{v}_{i}$ in (25).
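The second step of extended FW regression, fitting genotype-specific intercepts and slopes on given synthetic covariates, can be sketched with ordinary least squares. This is a deliberately simplified illustration and not the procedure of Piepho and Blancon (2023): the data are simulated, the SC matrix `Z` is taken as given, and the random environmental effects and the error structure of a full mixed model are ignored. All sizes and values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
I, J, q = 5, 12, 2                     # assumed: genotypes, environments, synthetic covariates

Z = rng.normal(size=(J, q))            # synthetic covariates, taken as given
alpha = rng.normal(size=I)             # simulated fixed intercepts alpha_i
beta = rng.normal(size=(I, q))         # simulated fixed slopes beta_i

# eta[i, j] = alpha_i + z_j^T beta_i, observed with a small simulated error
eta = alpha[:, None] + beta @ Z.T
y = eta + 0.01 * rng.normal(size=(I, J))

# Design matrix (1_J : Z); one least-squares regression per genotype, solved jointly
D = np.hstack([np.ones((J, 1)), Z])
coef, *_ = np.linalg.lstsq(D, y.T, rcond=None)    # shape (q + 1, I)
alpha_hat, beta_hat = coef[0], coef[1:].T

print(np.max(np.abs(beta_hat - beta)))
```

In practice the model would be fitted with a mixed model package so that random environmental effects are accounted for; the sketch only shows how the design matrix $\left(\boldsymbol{1}_{J} \,\vdots\, \mathbf{Z}\right)$ enters the regression.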
Adding genetic relationship matrix
The models introduced so far assume independence between genotypes. This can be modified by assuming a kinship matrix $\mathbf{K}_{G}$ for genotypes. For example, under the environmental kernel approach for EC, the vector of interaction effects $\boldsymbol{w} = \left(\boldsymbol{w}_{1}^{T}, \boldsymbol{w}_{2}^{T}, \ldots, \boldsymbol{w}_{I}^{T}\right)^{T}$, where $I$ is the number of genotypes and $\boldsymbol{w}_{i}$ is defined with the model in Eq. (17), can be assumed to be distributed as
$$\boldsymbol{w} \sim N\left(\boldsymbol{0}_{IJ}, \mathbf{K}_{G} \otimes \mathbf{K}_{E} \sigma_{\gamma}^{2}\right)$$
Similarly, for the genotype main effect $\boldsymbol{a} = \left(a_{1}, a_{2}, \ldots, a_{I}\right)^{T}$ we may assume
$$\boldsymbol{a} \sim N\left(\boldsymbol{0}_{I}, \mathbf{K}_{G} \sigma_{\alpha}^{2}\right)$$
The RFR and RRR approaches can be similarly modified.
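The Kronecker structure of the interaction covariance can be made concrete with a small example. In the sketch below, the kinship matrix, the covariates, the variance component, and in particular the linear form of the environmental kernel are all assumptions made for illustration; they are not taken from the paper. Because $\boldsymbol{w}$ stacks the environment vectors genotype by genotype, the block for genotypes $i$ and $i'$ is the corresponding kinship element times $\mathbf{K}_{E}\sigma_{\gamma}^{2}$.

```python
import numpy as np

rng = np.random.default_rng(4)
I, J, p = 3, 4, 2                        # assumed: genotypes, environments, covariates

# Hypothetical kinship matrix K_G (I x I), symmetric positive definite
A = rng.normal(size=(I, I))
K_G = A @ A.T + I * np.eye(I)

# Hypothetical environmental kernel K_E (a linear kernel is assumed here)
X = rng.normal(size=(J, p))
K_E = X @ X.T / p

sigma2_gamma = 0.5                       # illustrative variance component
V_w = np.kron(K_G, K_E) * sigma2_gamma   # covariance of w = (w_1^T, ..., w_I^T)^T

# Block (i, i2) of V_w equals K_G[i, i2] * K_E * sigma2_gamma
i, i2 = 0, 2
block = V_w[i * J:(i + 1) * J, i2 * J:(i2 + 1) * J]
print(V_w.shape, np.allclose(block, K_G[i, i2] * K_E * sigma2_gamma))
```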
Using regression models for predictions into new environments
A single regression term
One important use of regression models involving EC is to predict genotype performance in new environments. The new environment may involve a new location, a new year, or both. Clearly, when making recommendations to farmers, the farm’s location is almost invariably an unseen location because it is not part of the trial network on which the recommendation is based. Similarly, the most relevant scenario is for the new environment to be a projection into a future year or set of years. We assume here that long-term data are available for all EC in all locations of the TPE. Assuming that environment is a random factor and given that environments are indexed by locations and years, a two-way model may be assumed for each EC. For simplicity, we here consider a single EC. Extension to multiple EC and also to multiple SC is straightforward, as will be shown later. For a single EC, the model can be written as
$$x_{lm} = \mu_{x} + L_{x(l)} + Y_{x(m)} + \left(LY\right)_{x(lm)}$$
where $x_{lm}$ is the value of the EC in the $l$-th location in the $m$-th year, $\mu_{x}$ is an intercept, $L_{x(l)} \sim N\left(0, \sigma_{x(L)}^{2}\right)$ is the random main effect for the $l$-th location, $Y_{x(m)} \sim N\left(0, \sigma_{x(Y)}^{2}\right)$ is the random main effect for the $m$-th year, and $\left(LY\right)_{x(lm)} \sim N\left(0, \sigma_{x(LY)}^{2}\right)$ is the random location-year interaction for the $l$-th location and $m$-th year.
When making predictions of $\eta_{i}$, we need to consider that, depending on the prediction scenario, the exact value of $x$ may not be known with certainty for the targeted environment, but only a long-term mean of $x$, either for a specific location from the TPE that is the target of prediction, or for the whole of the TPE. Also, it may be possible that we are specifically interested in the prediction at a mean value of $x$, e.g. for a specific location or for the whole TPE. Alternatively, a prediction may be needed for a future year, in which case uncertainty regarding the future value of $x$ may come into play. In what follows, four different scenarios (cases) are distinguished.
To set the stage, we will initially consider the regression term $\gamma_{i} x$ for a single EC and its contribution to the prediction, as well as to the uncertainty of the prediction. Extension to the full model, including the intercept, and multiple EC will be considered later.
In general, we will need to use a value to plug in for $x$. This value will be an expected value of $x$ and will be denoted as $\xi$, so in general the term $\gamma_{i} \xi$ is used to predict the mean performance.
There may also be uncertainty associated with the value of $x$, and this can be assessed via a variance of $x$, which we will denote as $\sigma_{x}^{2}$, and which can be partitioned as $\sigma_{x}^{2} = \sigma_{x(L)}^{2} + \sigma_{x(Y)}^{2} + \sigma_{x(LY)}^{2}$. To assess the contribution $\phi_{x(i)}$ to the overall prediction variance (considered in detail in Sect. "Estimating the prediction variance"), we will initially assume that both $\gamma_{i}$ and $\xi$ are known, and hence the contribution to the prediction variance for the response is $\phi_{x(i)} = \gamma_{i}^{2} \sigma_{x}^{2}$. In subsequent sections, we will account for the additional uncertainty arising from the fact that both $\gamma_{i}$ and $\xi$ in the product $\gamma_{i} \xi$ need to be estimated.
In what follows, we will consider four different cases as regards the target of prediction. The four cases are summarized in Table 1. Next, the four cases will be described and assessed in detail.

Table 1. Four cases as regards the target of prediction

| Case | Target of prediction |
|---|---|
| 1 | Long-term mean in the TPE |
| 2 | A new year at the mean of the TPE |
| 3 | Long-term mean at new location (e.g., a farm) |
| 4 | A new year at a new location (e.g., a farm) |
Case 1: Assume that the objective is to obtain an estimate of the long-term mean in the TPE. In this case, we evaluate the fitted regression at the unconditional expectation of the EC in the TPE, given by

$$\xi = E(x) = \mu_x$$

i.e. the regression term is $\gamma_i \mu_x$. As this is the prediction at a long-term mean, there is no uncertainty associated with the value we use for the EC, except that the mean $\mu_x$ needs to be estimated, hence $\phi_{x(i)} = 0$.
Case 2: Next, consider the case where prediction for a new year at the mean of the TPE is needed. In this case, we would want to replace $x$ by the mean in the TPE in the new year $m_0$, given by

$$\xi_{m_0} = E(x \mid m_0) = \mu_x + Y_{x(m_0)}$$

If the covariate is only observed next year, however, we can merely use the long-term mean $\xi$ of $x$ in the TPE, i.e. $\xi = \mu_x$, which deviates from $\xi_{m_0}$ by the amount $Y_{x(m_0)}$. This deviation is also unknown at the time the prediction is needed, but if we have historical data on the EC, we can estimate the variance $\mathrm{var}(Y_{x(m_0)}) = \sigma_{x(Y)}^2$, and hence $\phi_{x(i)} = \gamma_i^2 \sigma_{x(Y)}^2$.
Case 3: If we want to estimate the long-term mean at a new location $l_0$, e.g. a farm, we use the long-term mean of the EC at the new location:

$$\xi_{l_0} = E(x \mid l_0) = \mu_x + L_{x(l_0)}$$

Note that in this case, the model in (31) will have a fixed location effect $L_{x(l)}$ because historical data are available for the target location. As we are estimating a long-term mean, we have $\phi_{x(i)} = 0$. Note that estimating $\xi_{l_0}$ requires long-term EC data to be available for the new location.
Case 4: Finally, assume that the objective is prediction of a mean for a new location in a new year. As in the previous case, the model in (31) will have a fixed location effect $L_{x(l)}$, and we use the long-term mean $\xi_{l_0}$ of the EC at the new location because the values of the EC at the new location for the future year will be unavailable when the prediction is needed. However, this time $\phi_{x(i)} = \gamma_i^2 \left( \sigma_{x(Y)}^2 + \sigma_{x(LY)}^2 \right)$.
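The four cases differ only in which variance components of the EC enter $\phi_{x(i)}$. The following is a minimal Python sketch of that bookkeeping, assuming, as in this section, that $\gamma_i$ and the variance components are known. The function name and argument names are hypothetical and do not come from the paper.

```python
def covariate_variance_contribution(gamma_i, case, var_year=0.0, var_loc_year=0.0):
    """Return phi_x(i) = gamma_i^2 * sigma_x^2 for Cases 1 to 4.

    var_year     : sigma^2_x(Y), year variance of the EC (assumed known)
    var_loc_year : sigma^2_x(LY), location-by-year variance of the EC (assumed known)
    """
    if case in (1, 3):
        # Long-term means: no uncertainty from the plugged-in EC value.
        sigma2_x = 0.0
    elif case == 2:
        # New year at the mean of the TPE: only the year effect is unknown.
        sigma2_x = var_year
    elif case == 4:
        # New year at a new location: year and location-by-year effects are unknown.
        sigma2_x = var_year + var_loc_year
    else:
        raise ValueError("case must be 1, 2, 3 or 4")
    return gamma_i ** 2 * sigma2_x
```

For example, with an illustrative slope of 2 and illustrative variance components 0.5 (year) and 0.25 (location-by-year), the contribution is 2.0 in Case 2 and 3.0 in Case 4.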
In many studies assessing the predictive benefit of using EC, a leave-one-environment-out cross-validation (CV) strategy is employed by which the left-out environment is used to validate the predictions obtained from a model fitted to the remaining environments. In this kind of CV, the observed EC values for the left-out environment are plugged into the fitted model to obtain a prediction. This means that these predictions for the left-out $lm$-th environment use $\xi = x_{lm}$. It should be pointed out that this prediction scenario does not correspond to any of the four cases defined above. Using CV with $\xi = x_{lm}$ certainly can inform about the predictive potential of EC, but we contend that it does not represent a realistic prediction scenario occurring in practice. A useful modification of leave-one-environment-out CV with multi-year and multi-location data mimics Case 4 by using $\xi = \xi_{l_0}$ in Eq. (32) for predictions into the left-out environment, involving the $l_0$-th location and a new year $m_0$.
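The difference between the two CV variants lies only in the covariate value that is plugged in for the left-out environment. The Python sketch below illustrates this under a simplifying assumption that is not taken from the paper: the long-term mean at the left-out location is estimated by a plain average over the other years at that location. A fit of model (31) with a fixed location effect need not reduce to this average, for instance with unbalanced data. The function and variable names are hypothetical.

```python
from statistics import mean


def case4_plug_in(ec, left_out):
    """Estimate xi_{l0} for a left-out environment (l0, m0).

    ec       : dict mapping (location, year) -> observed EC value
    left_out : tuple (l0, m0) identifying the left-out environment

    Assumption: xi_{l0} is approximated by the average EC at location l0
    over all years other than m0, so that the left-out year is never used.
    """
    l0, m0 = left_out
    history = [x for (loc, year), x in ec.items() if loc == l0 and year != m0]
    if not history:
        raise ValueError("no historical EC data for the left-out location")
    return mean(history)


# Illustrative values only.
ec_values = {("A", 2019): 10.0, ("A", 2020): 12.0, ("A", 2021): 20.0, ("B", 2021): 5.0}
standard_cv_value = ec_values[("A", 2021)]              # xi = x_lm (observed EC)
case4_value = case4_plug_in(ec_values, ("A", 2021))     # xi = xi_{l0} (historical mean)
```

In this toy example the standard CV plugs in the observed value 20.0, whereas the Case 4 variant plugs in the historical average 11.0, which is the only information available before the new year is observed.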
Estimating the prediction variance
The overall prediction variance $\upsilon_i$ associated with the regression term $\gamma_i x$ has two components. The first arises from the fact that both $\gamma_i$ and $\xi$ in the product $\gamma_i \xi$ need to be estimated. The second corresponds to $\phi_{x(i)} = \gamma_i^2 \sigma_x^2$, i.e. to the fact that $x$ is replaced by $\xi$, from which it may deviate. First, we consider estimation of the contribution $\gamma_i \xi$ to the predicted mean. We generally assume that EC and yield data, conditional on the EC, are independent. The estimator, being a product of two independent random variables $\hat{\gamma}_i$ and $\hat{\xi}$, has variance (Brown & Alexander 1991; Goodman 1960)

$$\mathrm{var}\left(\hat{\gamma}_i \hat{\xi}\right) = \gamma_i^2 \, \mathrm{var}\left(\hat{\xi}\right) + \xi^2 \, \mathrm{var}\left(\hat{\gamma}_i\right) + \mathrm{var}\left(\hat{\gamma}_i\right) \mathrm{var}\left(\hat{\xi}\right) \tag{35}$$

Note that the naïve plug-in estimator of $\gamma_i^2$, $\hat{\gamma}_i^2$, is biased. To see this, assume that the estimator $\hat{\gamma}_i$ is unbiased, i.e. $E\left(\hat{\gamma}_i\right) = \gamma_i$. Then from the definition of the variance (Rice 1995, p.124)
$$\mathrm{var}\left(\hat{\gamma}_i\right) = E\left(\hat{\gamma}_i^2\right) - \left[E\left(\hat{\gamma}_i\right)\right]^2 = E\left(\hat{\gamma}_i^2\right) - \gamma_i^2 \tag{36}$$

It emerges from Eq. (36) that $E\left(\hat{\gamma}_i^2\right) = \gamma_i^2 + \mathrm{var}\left(\hat{\gamma}_i\right)$, establishing the bias. Hence, we may estimate $\gamma_i^2$ by $\tilde{\gamma}_i^2 = \hat{\gamma}_i^2 - \mathrm{var}\left(\hat{\gamma}_i\right)$ and $\xi^2$ by $\tilde{\xi}^2 = \hat{\xi}^2 - \mathrm{var}\left(\hat{\xi}\right)$, where for simplicity we make no distinction in notation between $\mathrm{var}\left(\hat{\xi}\right)$ and $\mathrm{var}\left(\hat{\gamma}_i\right)$ and their estimators that need to be used in practice. Hence, the estimator of (35) is
$$\mathrm{est.var}\left(\hat{\gamma}_i \hat{\xi}\right) = \hat{\gamma}_i^2 \, \mathrm{var}\left(\hat{\xi}\right) + \hat{\xi}^2 \, \mathrm{var}\left(\hat{\gamma}_i\right) - \mathrm{var}\left(\hat{\gamma}_i\right) \mathrm{var}\left(\hat{\xi}\right) \tag{37}$$

Next, consider the estimation of $\phi_{x(i)} = \gamma_i^2 \sigma_x^2$, where $\sigma_x^2$ represents the variance of a random effect in model (31), or a sum of more than one such variance component, depending on the case considered in Sect. "A single regression term" (Cases 1 to 4). The naïve plug-in estimator $\hat{\gamma}_i^2 \hat{\sigma}_x^2$ is biased, and bias may be reduced by using the estimator

$$\tilde{\phi}_{x(i)} = \left[\hat{\gamma}_i^2 - \mathrm{var}\left(\hat{\gamma}_i\right)\right] \hat{\sigma}_x^2 \tag{38}$$

The overall prediction variance $\upsilon_i$ associated with $\gamma_i x$ may be estimated by adding (37) and (38).
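To make the arithmetic of (37) and (38) concrete, the following Python sketch evaluates both estimators and their sum from point estimates and their estimated variances. It is an illustration only: the function names are hypothetical, and the inputs are assumed to come from separate fits of the yield model and of the covariate model (31), in line with the independence assumption stated above.

```python
def est_var_product(gamma_hat, var_gamma, xi_hat, var_xi):
    """Eq. (37): bias-corrected estimate of var(gamma_hat * xi_hat).

    Obtained from Eq. (35) by replacing gamma_i^2 with gamma_hat^2 - var_gamma
    and xi^2 with xi_hat^2 - var_xi, which turns the last term negative.
    """
    return gamma_hat ** 2 * var_xi + xi_hat ** 2 * var_gamma - var_gamma * var_xi


def phi_tilde(gamma_hat, var_gamma, sigma2_x_hat):
    """Eq. (38): bias-reduced estimate of phi_x(i) = gamma_i^2 * sigma_x^2."""
    return (gamma_hat ** 2 - var_gamma) * sigma2_x_hat


def overall_prediction_variance(gamma_hat, var_gamma, xi_hat, var_xi, sigma2_x_hat):
    """Estimate of upsilon_i as the sum of (37) and (38)."""
    return (est_var_product(gamma_hat, var_gamma, xi_hat, var_xi)
            + phi_tilde(gamma_hat, var_gamma, sigma2_x_hat))
```

With the illustrative inputs `gamma_hat = 2.0`, `var_gamma = 0.25`, `xi_hat = 3.0`, `var_xi = 0.5` and `sigma2_x_hat = 1.0`, Eq. (37) gives 4.125, Eq. (38) gives 3.75, and the sum is 7.875. Note that the sketch applies no truncation, so the bracket in (38) becomes negative whenever $\hat{\gamma}_i^2 < \mathrm{var}(\hat{\gamma}_i)$.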
Extension to multiple EC and inclusion of the intercept
The linear predictor in Eq. (1) can be written as
$$\eta_i = \boldsymbol{\gamma}_i^{\prime T} \boldsymbol{x}^{\prime}$$

where $\boldsymbol{x}^{\prime} = \left(1, x_1, ..., x_p\right)^T$ and $\boldsymbol{\gamma}_i^{\prime} = \left(\alpha_i, \gamma_{i1}, ..., \gamma_{ip}\right)^T$. When it comes to prediction, $\boldsymbol{x}^{\prime}$ will be replaced by its conditional expectation, $\boldsymbol{\xi}^{\prime} = \left(1, \xi_1, ..., \xi_p\right)^T$, depending on the particular prediction scenario (Cases 1 to 4; see Sect. "A single regression term"), leading to the predictor

$$\eta_i = \boldsymbol{\gamma}_i^{\prime T} \boldsymbol{\xi}^{\prime}$$

Prediction using SC can be investigated analogously by simply replacing $\boldsymbol{x}^{\prime}$ with $\boldsymbol{z}^{\prime} = \left(1, z_1, ..., z_q\right)^T$, but for the sake of brevity this will not be considered here explicitly. The conditional variance of $\boldsymbol{x}^{\prime}$ for given $\boldsymbol{\xi}^{\prime}$ will be denoted as $\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}}$, and the associated prediction variance will be

$$\phi_{\boldsymbol{x}^{\prime}(i)} = \boldsymbol{\gamma}_i^{\prime T} \boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}} \boldsymbol{\gamma}_i^{\prime}$$

To assess the prediction variance, we use a multivariate extension of the two-way model (31) for the covariate vector $\boldsymbol{x}^{\prime}$:
$$\boldsymbol{x}_{lm}^{\prime} = \boldsymbol{\mu}_{\boldsymbol{x}^{\prime}} + \boldsymbol{L}_{\boldsymbol{x}^{\prime}(l)} + \boldsymbol{Y}_{\boldsymbol{x}^{\prime}(m)} + \left(\boldsymbol{L}\boldsymbol{Y}\right)_{\boldsymbol{x}^{\prime}(lm)}$$

where $\boldsymbol{\mu}_{\boldsymbol{x}^{\prime}} = \left(1, \mu_{x_1}, ..., \mu_{x_p}\right)^T$ is the overall mean of $\boldsymbol{x}_{lm}^{\prime}$ and the random-effect vectors are similarly defined with variances $\mathrm{var}\left(\boldsymbol{L}_{\boldsymbol{x}^{\prime}(l)}\right) = \boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(L)}$, $\mathrm{var}\left(\boldsymbol{Y}_{\boldsymbol{x}^{\prime}(m)}\right) = \boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(Y)}$, and $\mathrm{var}\left(\left(\boldsymbol{L}\boldsymbol{Y}\right)_{\boldsymbol{x}^{\prime}(lm)}\right) = \boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(LY)}$. Note that in Cases 3 and 4 this model will feature a fixed location effect $\boldsymbol{L}_{\boldsymbol{x}^{\prime}(l)}$ instead of a random one, because we focus on a specific location. We note that the first row and column of these three variance-covariance matrices, which correspond to the intercept, have all entries equal to zero. Moreover, covariates that do not change value over years have corresponding zero entries in $\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(Y)}$ and $\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(LY)}$. Hence, all three variance-covariance matrices are singular.
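The role of the zero first row and column can be illustrated with a short Python sketch of the quadratic form $\boldsymbol{\gamma}_i^{\prime T} \boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}} \boldsymbol{\gamma}_i^{\prime}$. The helper names are hypothetical, plain nested lists stand in for matrices, and the numerical values are illustrative only.

```python
def pad_with_intercept(sigma_ec):
    """Embed a p x p covariance matrix of the EC into a (p+1) x (p+1) matrix
    whose first row and column (the intercept position) are all zero."""
    p = len(sigma_ec)
    padded = [[0.0] * (p + 1) for _ in range(p + 1)]
    for r in range(p):
        for c in range(p):
            padded[r + 1][c + 1] = float(sigma_ec[r][c])
    return padded


def quadratic_form(gamma_prime, sigma):
    """Compute gamma'^T Sigma gamma' for a coefficient vector and a square matrix."""
    n = len(gamma_prime)
    return sum(gamma_prime[r] * sigma[r][c] * gamma_prime[c]
               for r in range(n) for c in range(n))


# Two covariates with illustrative (co)variances; the intercept adds no variance.
sigma_padded = pad_with_intercept([[1.0, 0.5], [0.5, 2.0]])
phi_a = quadratic_form([10.0, 2.0, 1.0], sigma_padded)   # alpha_i = 10
phi_b = quadratic_form([-3.0, 2.0, 1.0], sigma_padded)   # alpha_i = -3
```

Both calls return 8.0: because the intercept row and column are zero, the genotype intercept $\alpha_i$ does not affect this variance contribution, and only the slopes on the EC matter.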
With these definitions, we can now consider the four cases. The explicit expressions for $\boldsymbol{\xi}^{\prime}$, $\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}}$ and $\phi_{\boldsymbol{x}^{\prime}(i)}$ are given in Table 2.

Table 2. Explicit expressions for $\boldsymbol{\xi}^{\prime}$, $\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}}$, $\phi_{\boldsymbol{x}^{\prime}(i)}$ and $\upsilon_{R}$ for the four scenarios (Case 1 to 4) when predicting individual genotypes. Note: the contribution of deviations from regression (denoted $\upsilon_{R}$) and the variances that compose it and pertain to years ($Y$), locations ($L$), genotypes ($\alpha$), or interactions ($\alpha Y$, $\alpha L$, $\alpha LY$) are discussed in more detail in Sect. "Contribution of the deviations from regression".

$$\begin{array}{lllll}
\text{Case} & \boldsymbol{\xi}^{\prime} & \boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}} & \phi_{\boldsymbol{x}^{\prime}(i)} & \upsilon_{R} \\
\hline
1 & \boldsymbol{\mu}_{\boldsymbol{x}^{\prime}} & 0 & 0 & 0 \\
2 & \boldsymbol{\mu}_{\boldsymbol{x}^{\prime}} & \boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(Y)} & \boldsymbol{\gamma}_{i}^{\prime T}\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(Y)}\boldsymbol{\gamma}_{i}^{\prime} & \sigma_{Y}^{2}+\sigma_{\alpha Y}^{2} \\
3 & \boldsymbol{\mu}_{\boldsymbol{x}^{\prime}}+\boldsymbol{L}_{\boldsymbol{x}^{\prime}(l_{0})} & 0 & 0 & \sigma_{L}^{2}+\sigma_{\alpha L}^{2} \\
4 & \boldsymbol{\mu}_{\boldsymbol{x}^{\prime}}+\boldsymbol{L}_{\boldsymbol{x}^{\prime}(l_{0})} & \boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(Y)}+\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(LY)} & \boldsymbol{\gamma}_{i}^{\prime T}\left(\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(Y)}+\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(LY)}\right)\boldsymbol{\gamma}_{i}^{\prime} & \sigma_{L}^{2}+\sigma_{\alpha L}^{2}+\sigma_{Y}^{2}+\sigma_{\alpha Y}^{2}+\sigma_{LY}^{2}+\sigma_{\alpha LY}^{2}
\end{array}$$
Estimating the overall prediction variance
Again, the overall prediction variance has two components, one arising from the fact that $\eta_{i}=\boldsymbol{\gamma}_{i}^{\prime T}\boldsymbol{\xi}^{\prime}$ needs to be estimated and the second corresponding to $\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}}$. First, we consider estimation of the predicted mean $\eta_{i}=\boldsymbol{\gamma}_{i}^{\prime T}\boldsymbol{\xi}^{\prime}$. We generally assume that EC and yield data are independent. The estimator, being a product of two random vectors $\hat{\boldsymbol{\gamma}}_{i}^{\prime}$ and $\hat{\boldsymbol{\xi}}^{\prime}$, has total variance
$$\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime T}\hat{\boldsymbol{\xi}}^{\prime}\right)=\boldsymbol{\gamma}_{i}^{\prime T}\mathrm{var}\left(\hat{\boldsymbol{\xi}}^{\prime}\right)\boldsymbol{\gamma}_{i}^{\prime}+\boldsymbol{\xi}^{\prime T}\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime}\right)\boldsymbol{\xi}^{\prime}+\mathrm{trace}\left[\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime}\right)\mathrm{var}\left(\hat{\boldsymbol{\xi}}^{\prime}\right)\right] \tag{43}$$

This variance may be regarded as a straightforward extension of the result for a product of two scalar random variables (Brown & Alexander 1991; Goodman 1960) (see Appendix B for a derivation). We may estimate $\boldsymbol{\gamma}_{i}^{\prime T}\mathrm{var}\left(\hat{\boldsymbol{\xi}}^{\prime}\right)\boldsymbol{\gamma}_{i}^{\prime}$ by $\hat{\boldsymbol{\gamma}}_{i}^{\prime T}\mathrm{var}\left(\hat{\boldsymbol{\xi}}^{\prime}\right)\hat{\boldsymbol{\gamma}}_{i}^{\prime}-\mathrm{trace}\left[\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime}\right)\mathrm{var}\left(\hat{\boldsymbol{\xi}}^{\prime}\right)\right]$ and $\boldsymbol{\xi}^{\prime T}\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime}\right)\boldsymbol{\xi}^{\prime}$ by $\hat{\boldsymbol{\xi}}^{\prime T}\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime}\right)\hat{\boldsymbol{\xi}}^{\prime}-\mathrm{trace}\left[\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime}\right)\mathrm{var}\left(\hat{\boldsymbol{\xi}}^{\prime}\right)\right]$, where for simplicity of notation we make no distinction between $\mathrm{var}\left(\hat{\boldsymbol{\xi}}^{\prime}\right)$ and $\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime}\right)$ and their estimators. Hence, the estimator of (43) is
$$\mathrm{est.var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime T}\hat{\boldsymbol{\xi}}^{\prime}\right)=\hat{\boldsymbol{\gamma}}_{i}^{\prime T}\mathrm{var}\left(\hat{\boldsymbol{\xi}}^{\prime}\right)\hat{\boldsymbol{\gamma}}_{i}^{\prime}+\hat{\boldsymbol{\xi}}^{\prime T}\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime}\right)\hat{\boldsymbol{\xi}}^{\prime}-\mathrm{trace}\left[\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime}\right)\mathrm{var}\left(\hat{\boldsymbol{\xi}}^{\prime}\right)\right] \tag{44}$$

The estimates of $\mathrm{var}\left(\hat{\boldsymbol{\xi}}^{\prime}\right)$ and $\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime}\right)$ needed in (43) can be obtained from the inverse of the coefficient matrix of the mixed model equations after completion of the residual maximum likelihood estimation of the variance components involved (Searle et al. 1992, p. 276).
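The product-variance result (43) and its estimator (44) can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes multivariate normal, mutually independent estimators, and all numerical values (`gamma`, `xi`, `var_gamma`, `var_xi`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def product_variance(gamma, xi, var_gamma, var_xi):
    """Variance of gamma_hat' xi_hat for independent estimators, as in (43)."""
    return (gamma @ var_xi @ gamma
            + xi @ var_gamma @ xi
            + np.trace(var_gamma @ var_xi))

def estimated_product_variance(gamma_hat, xi_hat, var_gamma, var_xi):
    """Estimator (44): each plug-in quadratic form overshoots its target by
    the trace term in expectation, so the trace enters with a minus sign."""
    return (gamma_hat @ var_xi @ gamma_hat
            + xi_hat @ var_gamma @ xi_hat
            - np.trace(var_gamma @ var_xi))

# Hypothetical true values: intercept plus two covariates (assumption).
rng = np.random.default_rng(1)
gamma = np.array([5.0, 0.8, -0.3])
xi = np.array([1.0, 2.0, 0.5])
a = rng.normal(size=(3, 3))
b = rng.normal(size=(3, 3))
var_gamma = 0.05 * a @ a.T   # arbitrary positive semi-definite matrices
var_xi = 0.02 * b @ b.T

# Simulate independent estimators of gamma and xi.
n_draws = 200_000
g_hat = rng.multivariate_normal(gamma, var_gamma, size=n_draws)
x_hat = rng.multivariate_normal(xi, var_xi, size=n_draws)

exact = product_variance(gamma, xi, var_gamma, var_xi)
# Empirical variance of the simulated products gamma_hat' xi_hat.
monte_carlo = np.einsum("ij,ij->i", g_hat, x_hat).var()
# Average of estimator (44) over all simulated draws.
mean_estimate = (np.einsum("ij,jk,ik->i", g_hat, var_xi, g_hat)
                 + np.einsum("ij,jk,ik->i", x_hat, var_gamma, x_hat)
                 - np.trace(var_gamma @ var_xi)).mean()

print(exact, monte_carlo, mean_estimate)
```

Under these assumptions the three printed numbers should agree up to simulation error. In practice the two variance matrices are themselves estimated, which the sketch ignores.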
Next, consider the estimation of $\boldsymbol{\gamma}_{i}^{\prime T}\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}}\boldsymbol{\gamma}_{i}^{\prime}$ in (41). The naïve plug-in estimator $\hat{\phi}_{\boldsymbol{x}^{\prime}(i)}=\hat{\boldsymbol{\gamma}}_{i}^{\prime T}\hat{\boldsymbol{\Sigma}}_{\boldsymbol{x}^{\prime}}\hat{\boldsymbol{\gamma}}_{i}^{\prime}$ is biased, and bias may be reduced by using the estimator
$$\widetilde{\phi}_{\boldsymbol{x}^{\prime}(i)}=\hat{\boldsymbol{\gamma}}_{i}^{\prime T}\hat{\boldsymbol{\Sigma}}_{\boldsymbol{x}^{\prime}}\hat{\boldsymbol{\gamma}}_{i}^{\prime}-\mathrm{trace}\left[\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime}\right)\hat{\boldsymbol{\Sigma}}_{\boldsymbol{x}^{\prime}}\right] \tag{45}$$

Contribution of the deviations from regression
In all four cases, the residual terms $u_{j}$ and $e_{ij}$ from the model in (2) are partitioned by year and location, i.e. we will use
$$u_{lm}=L_{l}+Y_{m}+\left(LY\right)_{lm} \quad \mathrm{and} \tag{46}$$

$$e_{ilm}=\left(\alpha L\right)_{il}+\left(\alpha Y\right)_{im}+\left(\alpha LY\right)_{ilm} \tag{47}$$

where $L$, $Y$ and $\alpha$ denote the factors location, year and genotype, and subscripts $l$ and $m$ index locations and years. All effects in (46) and (47) are independently distributed with constant variance, i.e. $L_{l}\sim N\left(0,\sigma_{L}^{2}\right)$, $Y_{m}\sim N\left(0,\sigma_{Y}^{2}\right)$, $\left(LY\right)_{lm}\sim N\left(0,\sigma_{LY}^{2}\right)$, $\left(\alpha L\right)_{il}\sim N\left(0,\sigma_{\alpha L}^{2}\right)$, $\left(\alpha Y\right)_{im}\sim N\left(0,\sigma_{\alpha Y}^{2}\right)$, and $\left(\alpha LY\right)_{ilm}\sim N\left(0,\sigma_{\alpha LY}^{2}\right)$.
In each case, a subset of these variances will contribute to the overall uncertainty of the prediction. The variance of this contribution will be denoted as $\upsilon_{R}$. The total prediction variance is
$$\upsilon_{i}=\mathrm{var}\left(\hat{\boldsymbol{\gamma}}_{i}^{\prime T}\hat{\boldsymbol{\xi}}^{\prime}\right)+\phi_{\boldsymbol{x}^{\prime}(i)}+\upsilon_{R} \tag{48}$$

The total prediction variance can be estimated by plugging in corrected estimators (44) and (45) in (48). The explicit expressions of $\upsilon_{R}$ for the four cases are given in Table 2. Note that in Case 3 $\upsilon_{R}=\sigma_{L}^{2}+\sigma_{\alpha L}^{2}$, because the random effects $L_{l_{0}}$ and $\left(\alpha L\right)_{il_{0}}$ are unknown for a new location.
Also, in Case 4 we have $\upsilon_{R}=\sigma_{L}^{2}+\sigma_{\alpha L}^{2}+\sigma_{Y}^{2}+\sigma_{\alpha Y}^{2}+\sigma_{LY}^{2}+\sigma_{\alpha LY}^{2}$, because all location- and year-related effects are unknown for a new location and year. If prediction for several new locations in a new year is required, then $\upsilon_{i}$ is computed for every new location separately.
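To make the assembly of (48) concrete, the sketch below combines estimator (44), the bias-reduced estimator (45), and the case-specific residual contribution from Table 2. It is an illustration under stated assumptions, not the authors' code: the function names, the dictionary keys for the variance components, and every numerical input are hypothetical. The covariate matrix is given a zero first row and column for the intercept, as described in the text.

```python
import numpy as np

# Variance components entering v_R in each case, transcribed from Table 2.
# Keys are hypothetical labels: "aL" stands for sigma^2_{alpha L}, etc.
RESIDUAL_COMPONENTS = {
    1: (),
    2: ("Y", "aY"),
    3: ("L", "aL"),
    4: ("L", "aL", "Y", "aY", "LY", "aLY"),
}

def residual_variance(case, sigma2):
    """v_R: sum of the variance components that are unknown in this case."""
    return sum(sigma2[name] for name in RESIDUAL_COMPONENTS[case])

def corrected_phi(gamma_hat, var_gamma, sigma_x):
    """Bias-reduced estimator (45) of gamma' Sigma_x gamma."""
    return gamma_hat @ sigma_x @ gamma_hat - np.trace(var_gamma @ sigma_x)

def total_prediction_variance(case, gamma_hat, xi_hat, var_gamma, var_xi,
                              sigma_x, sigma2):
    """Estimate of v_i in (48), using (44) for the first term."""
    est_var_mean = (gamma_hat @ var_xi @ gamma_hat
                    + xi_hat @ var_gamma @ xi_hat
                    - np.trace(var_gamma @ var_xi))
    return (est_var_mean
            + corrected_phi(gamma_hat, var_gamma, sigma_x)
            + residual_variance(case, sigma2))

# Hypothetical inputs: intercept plus two environmental covariates.
gamma_hat = np.array([4.0, 0.5, -0.2])
xi_hat = np.array([1.0, 3.0, 10.0])
var_gamma = np.diag([0.04, 0.01, 0.01])
var_xi = np.diag([0.0, 0.09, 0.25])        # intercept entry carries no variance
sigma_x_year = np.array([[0.0, 0.0, 0.0],   # zero row/column for the intercept
                         [0.0, 1.0, 0.2],
                         [0.0, 0.2, 2.0]])
sigma2 = {"L": 0.3, "Y": 0.2, "LY": 0.1, "aL": 0.05, "aY": 0.04, "aLY": 0.06}

# Case 2: Sigma_x' equals Sigma_x'(Y) according to Table 2.
v_case2 = total_prediction_variance(2, gamma_hat, xi_hat, var_gamma, var_xi,
                                    sigma_x_year, sigma2)
# Case 1: Sigma_x' is zero and no residual component contributes.
v_case1 = total_prediction_variance(1, gamma_hat, xi_hat, var_gamma, var_xi,
                                    np.zeros((3, 3)), sigma2)
print(v_case1, v_case2)
```

For Cases 3 and 4 the same function applies once `xi_hat` and `var_xi` refer to the location-specific covariate mean and `sigma_x` is set as in Table 2.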
Pairwise differences
To compute $\upsilon$ for the pairwise difference of two genotypes $i$ and $i^{\prime}$, we replace $\boldsymbol{\gamma}_{i}^{\prime}$ with $\boldsymbol{\delta}_{ii^{\prime}}^{\prime}=\boldsymbol{\gamma}_{i}^{\prime}-\boldsymbol{\gamma}_{i^{\prime}}^{\prime}$.
In the residual prediction variance for a difference $\upsilon_{R}$, we may drop $\sigma_{L}^{2}$, $\sigma_{Y}^{2}$ and $\sigma_{LY}^{2}$, because the corresponding random effects drop out in the pairwise difference. Applying Eqs. (43), (44), and (47) to pairs, replacing $\boldsymbol{\gamma}_{i}^{\prime}$ with $\boldsymbol{\delta}_{ii^{\prime}}^{\prime}=\boldsymbol{\gamma}_{i}^{\prime}-\boldsymbol{\gamma}_{i^{\prime}}^{\prime}$, we find after some algebra (see Appendix C) the average total prediction variance of a difference
$$\hat{\overline{\upsilon}}_{\delta}=\mathrm{trace}\left\{\left[\mathbf{P}\otimes\left(\mathrm{var}\left(\hat{\boldsymbol{\xi}}^{\prime}\right)+\hat{\boldsymbol{\Sigma}}_{\boldsymbol{x}^{\prime}}\right)\right]\hat{\boldsymbol{\gamma}}^{\prime}\hat{\boldsymbol{\gamma}}^{\prime T}\right\}+\mathrm{trace}\left[\mathrm{var}\left(\hat{\boldsymbol{\gamma}}^{\prime}\right)\left(\mathbf{P}\otimes\hat{\boldsymbol{\xi}}^{\prime}\hat{\boldsymbol{\xi}}^{\prime T}\right)\right]-\mathrm{trace}\left\{\mathrm{var}\left(\hat{\boldsymbol{\gamma}}^{\prime}\right)\left[\mathbf{P}\otimes\left(\mathrm{var}\left(\hat{\boldsymbol{\xi}}^{\prime}\right)+\hat{\boldsymbol{\Sigma}}_{\boldsymbol{x}^{\prime}}\right)\right]\right\}+\hat{\overline{\upsilon}}_{R} \tag{49}$$

where $\boldsymbol{\gamma}^{\prime}=\left(\boldsymbol{\gamma}_{1}^{\prime T},\ldots,\boldsymbol{\gamma}_{I}^{\prime T}\right)^{T}$, $\mathbf{P}=2\left(I-1\right)^{-1}\left[\mathbf{I}_{I}-\mathbf{K}_{I}\right]$ with $\mathbf{K}_{I}=I^{-1}\mathbf{1}_{I}\mathbf{1}_{I}^{T}$, and the average residual prediction variance of a difference $\overline{\upsilon}_{R}$ is defined in Table 3.

Table 3. Explicit expressions for $\boldsymbol{\xi}^{\prime}$, $\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}}$ and $\overline{\upsilon}_{R}$ for the four scenarios (Case 1 to 4) when predicting genotype differences and averaging the uncertainty across pairs.

$$\begin{array}{llll}
\text{Case} & \boldsymbol{\xi}^{\prime} & \boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}} & \overline{\upsilon}_{R} \\
\hline
1 & \boldsymbol{\mu}_{\boldsymbol{x}^{\prime}} & 0 & 0 \\
2 & \boldsymbol{\mu}_{\boldsymbol{x}^{\prime}} & \boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(Y)} & 2\sigma_{\alpha Y}^{2} \\
3 & \boldsymbol{\mu}_{\boldsymbol{x}^{\prime}}+\boldsymbol{L}_{\boldsymbol{x}^{\prime}(l_{0})} & 0 & 2\sigma_{\alpha L}^{2} \\
4 & \boldsymbol{\mu}_{\boldsymbol{x}^{\prime}}+\boldsymbol{L}_{\boldsymbol{x}^{\prime}(l_{0})} & \boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(Y)}+\boldsymbol{\Sigma}_{\boldsymbol{x}^{\prime}(LY)} & 2\left(\sigma_{\alpha L}^{2}+\sigma_{\alpha Y}^{2}+\sigma_{\alpha LY}^{2}\right)
\end{array}$$
Materials and methods
Datasets
Long-term multi-environment rice stability trials, described by Rahman et al. (2023) and provided by the Bangladesh Rice Research Institute (BRRI), are used to demonstrate models and methods. These trials include registered varieties from two distinct breeding programs: irrigated winter (dry season) rice and rainfed summer (monsoon season) rice. The two datasets are referred to as winter rice and summer rice, respectively, and were analyzed separately. They contain yield observations (t/ha) obtained from randomized complete block designs with three blocks, replicated across 8 (summer rice) and 9 (winter rice) locations per year, between 2001 and 2022. Over time, the number of summer rice varieties increased from 16 to 45 and the number of winter rice varieties increased from 18 to 42, as new varieties were added while older ones were retained. Varieties tested in fewer than two years were excluded.
Covariates
Weather data were sourced from the AgERA5 database (Boogaard et al. 2020) using the cdsapi application programming interface in Python 3.11.7 (Van Rossum & Drake 2009), and decoded using the ag5Tools library (Brown et al. 2023) in R (R Core Team 2021). Weather covariates were aggregated by growing season. A total of 8 covariates was used for each dataset. A full description of selected covariates is provided in Appendix D.
Models
Both rice datasets were analyzed using a two-stage approach. In the first stage, the genotype means and associated variances were estimated by fitting a model (Rahman et al. 2023):
$$y_{ir} = \mu + b_{r} + g_{i} + e_{ir}$$

where $y_{ir}$ is the observation of the $i$-th genotype in the $r$-th block, $\mu$ is the fixed intercept, $b_{r}$ is the $r$-th fixed block effect, $g_{i}$ is the $i$-th fixed genotype effect, and $e_{ir}$ is the independent and identically distributed error associated with the $i$-th genotype and $r$-th block. Inverse variances of adjusted genotype means were used in the second stage of the analysis as diagonal weights. Comparison of models described in Sect. "Regression models" is done based on their performance in the second stage. All models have fixed overall intercept and fixed covariate slopes, and all the other effects are treated as random.
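The first-stage step described above can be sketched in a few lines. This is a minimal illustration, not the authors' ASReml-R code: it assumes complete data for a single environment, fits the block-plus-genotype model by ordinary least squares with NumPy, and the function name `first_stage` and the yield values are hypothetical.

```python
import numpy as np


def first_stage(y):
    """Fit y_ir = mu + b_r + g_i + e_ir by ordinary least squares.

    y is an (I, R) array: genotypes in rows, blocks in columns (complete data).
    Returns adjusted genotype means, their variances, and inverse-variance weights.
    """
    n_gen, n_blk = y.shape
    gen = np.repeat(np.arange(n_gen), n_blk)   # genotype index of each plot
    blk = np.tile(np.arange(n_blk), n_gen)     # block index of each plot

    # Genotype indicator columns: their coefficients are the adjusted means,
    # because the block columns below are coded to sum to zero over blocks.
    x_gen = np.eye(n_gen)[gen]
    contrasts = np.vstack([np.eye(n_blk - 1), -np.ones((1, n_blk - 1))])
    x_blk = contrasts[blk]
    x = np.hstack([x_gen, x_blk])

    y_vec = y.ravel()
    xtx_inv = np.linalg.inv(x.T @ x)
    beta = xtx_inv @ x.T @ y_vec
    resid = y_vec - x @ beta
    sigma2 = resid @ resid / (y_vec.size - x.shape[1])  # residual variance

    means = beta[:n_gen]
    var_means = sigma2 * np.diag(xtx_inv)[:n_gen]
    weights = 1.0 / var_means                  # diagonal weights for stage two
    return means, var_means, weights


# Hypothetical yields (t/ha) for 4 genotypes in 3 blocks of one environment.
yields = np.array([
    [4.1, 4.4, 4.0],
    [5.2, 5.6, 5.1],
    [3.8, 4.1, 3.9],
    [4.9, 5.0, 4.6],
])
means, var_means, weights = first_stage(yields)
```

With complete blocks, the adjusted means coincide with the simple genotype means and all weights are equal; the weights differ between genotypes only when plots are missing.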
The main genotypic effect $\alpha$ and terms $u_{j}$ and $e_{ij}$, partitioned by year and location as in (46) and (47), appear in every model unchanged. Therefore, the models differ solely in how covariates are incorporated:
- Baseline model without genotype-specific covariate response.
- Environmental kernel model with the kernel matrix derived from observed covariates on a per-environment basis (15).
- RRR with the reduced rank variance-covariance structure of rank one (RRR1) or two (RRR2) assigned to genotype-specific covariate slopes and the genotypic main effect (18).
- RFR with the unstructured variance-covariance structure assigned to genotype-specific covariate slopes and the genotypic main effect.
- Unstructured regression with synthetic covariates (FW-US), fitted similarly to RFR, except that synthetic covariates were used instead of observed ones. Synthetic covariates were derived as in (24) using a method similar to the one described for the Extended Finlay-Wilkinson regression in Piepho & Blancon (2023, Sect. 5.2, Eq. 18), except that the $\alpha L$, $\alpha Y$, and $\alpha LY$ effects were included and fitted as random, as this improved the quality of the synthetic covariates. Only one (FW1-US) and two (FW2-US) synthetic covariates were obtained, as this was shown to be sufficient in related literature (Piepho & Blancon 2023; Tadese et al. 2024).
A side-by-side illustration of these models in simplified notation is available in Appendix E.
In the case of the Kernel model, the number of covariates can be much larger than the number of environments, in which case the fixed covariate slopes cannot be estimated. Because this situation can arise, we investigated omitting the fixed covariate slopes for every model and refer to such models as “without the main EC effect” in tables and the text.
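The following sketch illustrates why fixed covariate slopes become inestimable when covariates outnumber environments. It uses simulated covariates and hypothetical sizes. The linear kernel at the end is an assumed form chosen for illustration and is not necessarily the kernel defined in (15).

```python
import numpy as np

rng = np.random.default_rng(0)
n_env, n_cov = 6, 20                      # hypothetical sizes, n_cov > n_env
w = rng.normal(size=(n_env, n_cov))       # simulated environmental covariates
w = w - w.mean(axis=0)                    # centre each covariate

# Fixed-effects design: intercept plus one slope per covariate.
x = np.column_stack([np.ones(n_env), w])
n_fixed = x.shape[1]
rank = np.linalg.matrix_rank(x)           # cannot exceed the number of rows

# Linear kernel between environments (assumed form); its size depends only
# on the number of environments, not on the number of covariates.
kernel = w @ w.T / n_cov

print(f"fixed effects to estimate: {n_fixed}, rank of design: {rank}")
print(f"kernel shape: {kernel.shape}")
```

The design matrix has 21 columns but rank 6, so the 20 slopes are not uniquely estimable, whereas the kernel remains a 6 x 6 matrix regardless of how many covariates enter it.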
All models were fitted in ASReml-R 4.2 (Butler et al. 2023) in the R programming language, version 4.5.1, enhanced by the oneAPI Intel Math Kernel Library, version 2024.0.
Model performance evaluation
The models were evaluated by comparing their fit to the complete data and through two CV scenarios.
The model fit was evaluated based on several criteria, including the number of parameters (accounting for those associated with synthetic covariates, if applicable), the log-likelihood (LogLik) of the fitted model, the Akaike Information Criterion (AIC), the variance components and their percentage change relative to the baseline model. The AIC values were calculated using a method proposed by Verbyla (2019) implemented in the infoCriteria function from the asremlPlus package (Brien 2024), and then adjusted for the number of parameters involved in synthetic covariates:
$$\mathrm{AIC} = -2 \cdot LogLik + 2 \cdot \left( VarP + FixedP + SP \right)$$

where $LogLik$ is the full log-likelihood, $VarP$ is the number of estimated variance parameters, $FixedP$ is the number of estimated fixed parameters, and $SP$ is the number of parameters involved in synthetic covariates. The $SP$ of FW1-US was 8, while for FW2-US, the $SP$ was 15. The $SP$ of other models was 0. Parameters fixed by ASReml-R during model fitting were nevertheless included in the total number of parameters. Parameters that were fixed by design were not included in the AIC (i.e., the residual variance parameter fixed at 1 to correctly provide weights from the first stage analysis, and the first loading of the second-order reduced rank variance-covariance structure fixed at 0).
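The adjusted AIC above amounts to a one-line computation. The helper below is a hypothetical sketch, not part of asremlPlus. The split into 23 variance and 9 fixed parameters is an illustrative assumption that sums to the 32 parameters reported for RRR2 with the main EC effect, and the reported log-likelihoods are rounded, so agreement with published AIC values is only approximate in general.

```python
def adjusted_aic(loglik, var_p, fixed_p, sp=0):
    """AIC = -2 * LogLik + 2 * (VarP + FixedP + SP)."""
    return -2.0 * loglik + 2.0 * (var_p + fixed_p + sp)


# Illustrative call: rounded full log-likelihood of -412 and 32 parameters
# in total (assumed split: 23 variance + 9 fixed, no synthetic covariates).
aic_rrr2 = adjusted_aic(loglik=-412.0, var_p=23, fixed_p=9, sp=0)
print(aic_rrr2)  # 888.0
```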
The CV was used to evaluate the performance of models using various criteria: Pearson's correlation coefficient (PCC) between predicted and observed values, mean squared prediction error (MSPE), mean squared error of predicted differences (MSEPD), mean variance of predictions (MVP), and mean variance of predicted differences (VPD).
MSPE is defined as:
$$\mathrm{MSPE}_{j} = \frac{\sum_{i = 1}^{I} \left( y_{ij} - \hat{y}_{ij} \right)^{2}}{I}$$

where $y_{ij}$ is the observed phenotypic value of the $i$-th genotype in the $j$-th environment, $\hat{y}_{ij}$ is the respective predicted phenotypic value, and $I$ is the total number of genotypes. Environment-specific means are then aggregated as means or medians.
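As a minimal sketch of the MSPE computation and its aggregation, assuming complete genotype-by-environment arrays (in practice not every genotype is tested in every environment), the function names and the toy numbers below are hypothetical:

```python
import numpy as np


def mspe_per_env(y_obs, y_pred):
    """MSPE_j for each environment; inputs are (I, J) arrays."""
    return np.mean((y_obs - y_pred) ** 2, axis=0)


def aggregate(values):
    """Summarise environment-specific values by their mean and median."""
    return float(np.mean(values)), float(np.median(values))


# Toy data: 2 genotypes (rows) in 2 environments (columns).
y_obs = np.array([[1.0, 2.0], [3.0, 4.0]])
y_pred = np.array([[1.0, 1.0], [1.0, 4.0]])

mspe_j = mspe_per_env(y_obs, y_pred)      # array([2.0, 0.5])
mspe_mean, mspe_median = aggregate(mspe_j)
```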
MSEPD is defined as (Piepho 1998):
$$\mathrm{MSEPD}_{j} = \frac{2\sum_{i = 1}^{I} \left( f_{ij} - \overline{f}_{\cdot j} \right)^{2}}{I - 1}$$

where $f_{ij} = y_{ij} - \hat{y}_{ij}$, with $y_{ij}$ and $\hat{y}_{ij}$ defined as for MSPE, and $\overline{f}_{\cdot j}$ is the mean of $f_{ij}$ in the $j$-th environment. Environment-specific means are then aggregated as means or medians.
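The MSEPD formula is algebraically the average squared error over all $I(I-1)/2$ pairwise genotype differences within an environment, because $\sum_{i<i'} (f_{ij} - f_{i'j})^{2} = I \sum_{i} (f_{ij} - \overline{f}_{\cdot j})^{2}$. The sketch below checks this numerically for one environment; the function names and values are hypothetical.

```python
import numpy as np
from itertools import combinations


def msepd(y_obs, y_pred):
    """MSEPD_j from the closed-form expression (one environment, 1-D arrays)."""
    f = y_obs - y_pred
    return 2.0 * np.sum((f - f.mean()) ** 2) / (f.size - 1)


def msepd_pairwise(y_obs, y_pred):
    """Average squared prediction error of all pairwise genotype differences."""
    f = y_obs - y_pred
    squared = [(f[i] - f[k]) ** 2 for i, k in combinations(range(f.size), 2)]
    return float(np.mean(squared))


# Toy data: 4 genotypes in a single environment.
y_obs = np.array([5.0, 6.0, 4.5, 7.0])
y_pred = np.array([4.8, 6.4, 4.0, 6.5])

closed_form = msepd(y_obs, y_pred)
by_pairs = msepd_pairwise(y_obs, y_pred)
```

Both routes give 0.36 for this toy example. The closed form avoids the explicit loop over pairs, which matters when the number of genotypes is large.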
Variance of prediction and VPD were calculated as in (48) and (49), respectively. The variance of prediction was averaged across genotypes within an environment to get the MVP. Such environment-specific values of MVP and VPD were aggregated as means or medians.
The first CV scenario is a common leave-one-environment-out (LOEO) approach, where known covariate values are used to predict the phenotypic response in the left-out environment. The second scenario mimics prediction in an unseen environment by leaving an entire year and location out (LYLO). In this case, prediction is performed for the left-out location in the left-out year using the mean covariate values from available years for that location. In LYLO CV, the effects $Y$, $L$, $YL$, $\alpha L$, $\alpha Y$, and $\alpha LY$ are not available for prediction. In contrast, under the LOEO scenario, only $YL$ and $\alpha LY$ are not available. Covariate values in LYLO CV can be imputed as a simple mean over years in the training set or calculated using a multivariate mixed model as in (42), which is numerically identical in this setting. This model is required to estimate the variances and covariances of EC for the MVP and VPD.
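The two CV scenarios can be sketched as boolean masks over the records. This is an illustrative reading of the description above, not the authors' code: it assumes that LYLO removes every record from the target year and every record from the target location, and that the covariate table still covers the target location in the remaining years. All names and values are hypothetical.

```python
import numpy as np


def loeo_split(years, locs, year, loc):
    """Leave one environment (year-location combination) out."""
    test = (years == year) & (locs == loc)
    return ~test, test


def lylo_split(years, locs, year, loc):
    """Leave the whole year and the whole location out of training."""
    test = (years == year) & (locs == loc)
    train = (years != year) & (locs != loc)
    return train, test


def impute_covariates(cov, cov_years, cov_locs, year, loc):
    """Mean covariate vector of `loc` over all years except `year`."""
    keep = (cov_locs == loc) & (cov_years != year)
    return cov[keep].mean(axis=0)


# Hypothetical layout: 3 years x 2 locations, one record per environment,
# with two covariates per environment.
years = np.array([2001, 2001, 2002, 2002, 2003, 2003])
locs = np.array(["A", "B", "A", "B", "A", "B"])
cov = np.array([[20.0, 300.0], [22.0, 250.0], [21.0, 320.0],
                [23.0, 240.0], [19.0, 340.0], [24.0, 260.0]])

train_loeo, test_loeo = loeo_split(years, locs, 2001, "A")
train_lylo, test_lylo = lylo_split(years, locs, 2001, "A")
x_new = impute_covariates(cov, years, locs, 2001, "A")
```

For the target environment (2001, A), LOEO keeps five training records, LYLO keeps only the two records from location B in 2002 and 2003, and the imputed covariate vector is the average of location A in 2002 and 2003.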
Critical points in model fitting
Covariates were centered and scaled to unit variance only once in LOEO CV, because the table of covariates is the same for every CV split there. In LYLO CV we repeated centering and scaling in every split.
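A minimal sketch of per-split centering and scaling follows. It assumes that the centering and scaling constants are estimated from the covariates available in a split and then reused for the target environment; the text above does not spell out this detail, and the function names and values are hypothetical.

```python
import numpy as np


def fit_scaler(train_cov):
    """Column means and standard deviations of the training covariates."""
    return train_cov.mean(axis=0), train_cov.std(axis=0, ddof=1)


def apply_scaler(cov, mean, sd):
    """Centre and scale covariates with previously estimated constants."""
    return (cov - mean) / sd


# Hypothetical covariates: 4 training environments, 2 covariates.
train_cov = np.array([[20.0, 300.0], [22.0, 250.0],
                      [21.0, 320.0], [23.0, 240.0]])
new_cov = np.array([[21.5, 280.0]])       # covariates of the target environment

mean, sd = fit_scaler(train_cov)
train_scaled = apply_scaler(train_cov, mean, sd)
new_scaled = apply_scaler(new_cov, mean, sd)
```

In LOEO the constants would be computed once from the full covariate table, whereas in LYLO the call to `fit_scaler` would be repeated inside the loop over splits.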
To compute the prediction variance using the method suggested in this paper, we need to extract the so-called C-inverse matrix, which contains the variances and covariances of genotypic intercepts and genotype-specific covariate slopes. This matrix is only partially available in ASReml 4.2. Therefore, the approach described by Henderson (1984, Chapter 5) was implemented for both singular and non-singular G matrices. A script illustrating the comparison of the two matrices is available on GitHub.
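As a hedged illustration of the idea, the sketch below builds Henderson's mixed model equations for a small simulated model with a non-singular G, inverts the coefficient matrix, and compares its random-effects block with the direct expression $G - GZ^{T}PZG$, where $P = V^{-1} - V^{-1}X(X^{T}V^{-1}X)^{-1}X^{T}V^{-1}$. The direct expression does not invert G and therefore remains usable when G is singular. This is not the authors' GitHub script; all names, dimensions, and values are hypothetical.

```python
import numpy as np


def c_inverse(x, z, g, r):
    """Inverse of the mixed model equations' coefficient matrix (non-singular G)."""
    r_inv = np.linalg.inv(r)
    g_inv = np.linalg.inv(g)
    coef = np.block([
        [x.T @ r_inv @ x, x.T @ r_inv @ z],
        [z.T @ r_inv @ x, z.T @ r_inv @ z + g_inv],
    ])
    return np.linalg.inv(coef)


def pev_direct(x, z, g, r):
    """Var(u_hat - u) = G - G Z' P Z G, which does not require inverting G."""
    v = z @ g @ z.T + r
    v_inv = np.linalg.inv(v)
    xv = x.T @ v_inv
    p = v_inv - xv.T @ np.linalg.inv(xv @ x) @ xv
    return g - g @ z.T @ p @ z @ g


rng = np.random.default_rng(42)
n, n_fixed, n_random = 12, 2, 4
x = np.column_stack([np.ones(n), rng.normal(size=n)])   # fixed-effects design
z = np.kron(np.eye(n_random), np.ones((3, 1)))          # 3 records per level
a = rng.normal(size=(n_random, n_random))
g = a @ a.T + np.eye(n_random)                           # positive definite G
r = np.diag(rng.uniform(0.5, 1.5, size=n))               # diagonal residual matrix

pev_mme = c_inverse(x, z, g, r)[n_fixed:, n_fixed:]      # random-effects block
pev_alt = pev_direct(x, z, g, r)

# Rank-one (singular) G: only the direct expression is applicable here.
lam = rng.normal(size=(n_random, 1))
pev_singular = pev_direct(x, z, lam @ lam.T, r)
```

For the non-singular case the two matrices agree to numerical precision, which is the kind of check that can be used to validate a C-inverse matrix reconstructed outside the fitting software.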
RRR1, RRR2, and RFR models sometimes had convergence issues. We therefore first fitted each model to all the data and, when issues arose, used the resulting variance component estimates as starting values. The model was forced to run additional iterations when ASReml-R reported convergence but the LogLik continued to increase visibly. Model fitting was stopped if the LogLik suddenly increased to an unrealistic value or started to decrease.
Running FW1-US and FW2-US models involved two steps: one to extract synthetic covariates, and one to fit them. In both CV scenarios, extraction of synthetic covariates was done in every CV split.
Results
The results from the winter and summer rice datasets are largely consistent, although winter rice was irrigated, which likely disrupts the connection between plant performance and external weather conditions. For brevity, results related to the winter rice dataset are provided in the Online Resource (Supplementary Information). Nevertheless, this section highlights the differences between the results from the two datasets.
The fit of models on complete data is summarized in Table 4. The models with the main EC effect always had more parameters and a higher LogLik than their counterparts, but this did not translate into a notable AIC reduction. The RFR had the highest number of parameters and the highest LogLik, but an AIC on par with that of the RRR1 model, which approximates RFR with fewer parameters. The smallest AIC was achieved by the RRR2 model, followed by FW1-US and FW2-US.

Table 4. Model fit (summer rice) featuring the total number of parameters in the model (including those involved in synthetic covariates), full log-likelihood, and Akaike Information Criterion (AIC). Displayed values were rounded. Baseline – model without genotype-covariate interactions, Kernel – model with an environmental kernel matrix, RRR1 and RRR2 – reduced rank regression of rank one and two with observed covariates, RFR – random factorial regression with observed covariates, FW1-US and FW2-US – random factorial regression with one and two synthetic covariates respectively

| Model | With the main EC effect: Parameters | With the main EC effect: LogLik | With the main EC effect: AIC | Without the main EC effect: Parameters | Without the main EC effect: LogLik | Without the main EC effect: AIC |
|---|---|---|---|---|---|---|
| Baseline | 16 | −459 | 949 | 8 | −465 | 947 |
| Kernel | 17 | −449 | 931 | 9 | −455 | 929 |
| RRR1 | 24 | −443 | 933 | 16 | −450 | 932 |
| RRR2 | 32 | −412 | 888 | 24 | −420 | 887 |
| RFR | 60 | −407 | 935 | 52 | −415 | 934 |
| FW1-US | 19 | −434 | 905 | 18 | −435 | 907 |
| FW2-US | 30 | −424 | 907 | 28 | −428 | 911 |

The variance components of different models are summarized as percent change relative to the baseline model in Table 5. All variance component values for summer rice are available in Supplementary Information (Table S2). The reduction in the $\alpha L$, $\alpha Y$, and $\alpha LY$ components is linked to the incorporation of environmental covariates. However, although the reduction in the $\alpha L$ and $\alpha Y$ components is substantial in relative terms, it is minor compared to the magnitude of the $LY$ and $\alpha LY$ components.

Table 5. Variance components percent change relative to the baseline (summer rice). Baseline – model without genotype-covariate interactions, Kernel – model with an environmental kernel matrix, RRR1 and RRR2 – reduced rank regression of rank one and two with observed covariates, RFR – random factorial regression with observed covariates, FW1-US and FW2-US – random factorial regression with one and two synthetic covariates, $L$ – location, $Y$ – year, $\alpha$ – genotype

| Type | Component | Baseline^a^ | Kernel | RRR1 | RRR2 | RFR | FW1-US | FW2-US |
|---|---|---|---|---|---|---|---|---|
| Models with the main EC effect | $L$ | 0.0568 | −2.5 | −2.0 | −5.6 | −5.5 | 78.8 | −5.8 |
| | $Y$ | 0.0262 | 3.3 | −1.9 | 9.6 | 9.6 | −9.6 | 41.5 |
| | $\alpha$ | 0.2422 | 0.7 | −4.7 | −4.2 | −5.4 | −3.4 | −3.2 |
| | $LY$ | 0.4434 | 0.2 | 0.3 | 0.7 | 0.6 | 1.0 | −0.1 |
| | $\alpha L$ | 0.0307 | −26.7 | −12.6 | −27.6 | −39.3 | −19.1 | −29.0 |
| | $\alpha Y$ | 0.0142 | −24.4 | −16.0 | −50.4 | −58.0 | −31.7 | −32.8 |
| | $\alpha LY$ | 0.2695 | −0.1 | 0.4 | −0.6 | −1.0 | −1.0 | −1.1 |
| Models without the main EC effect | $L$ | 0.1269 | 0.5 | 2.1 | 2.6 | 3.0 | 0.9 | 2.4 |
| | $Y$ | 0.0250 | 5.3 | 9.8 | 28.4 | 28.0 | 20.2 | 18.9 |
| | $\alpha$ | 0.2430 | 0.8 | −4.2 | −3.5 | −4.7 | −3.2 | −2.9 |
| | $LY$ | 0.4441 | 0.2 | 0.4 | 0.8 | 0.7 | 0.4 | 0.7 |
| | $\alpha L$ | 0.0307 | −26.5 | −12.5 | −28.0 | −39.6 | −19.1 | −29.1 |
| | $\alpha Y$ | 0.0142 | −24.2 | −16.0 | −50.1 | −57.8 | −31.7 | −32.9 |
| | $\alpha LY$ | 0.2695 | −0.1 | 0.4 | −0.6 | −1.0 | −1.0 | −1.0 |

^a^ The Baseline column gives the actual values from the Baseline model; the remaining columns are in percent relative to it
The results of LOEO and LYLO CVs are reported in Table 6. We also provided the medians of PCC, MSEPD, and MSPE in the Supplementary Information (Table S7), because their distributions are skewed.

Table 6. Leave-one-environment-out (LOEO) and leave-one-year-and-location-out (LYLO) cross-validation means (summer rice). PCC – Pearson's correlation coefficient, MSEPD – mean squared error of predicted difference, MSPE – mean squared prediction error, Baseline – model without genotype-covariate interactions, Kernel – model with an environmental kernel matrix, RRR1 and RRR2 – reduced rank regression of rank one and two with observed covariates, RFR – random factorial regression with observed covariates, FW1-US and FW2-US – random factorial regression with one and two synthetic covariates

| Type | Model | Mean PCC (LOEO) | Mean PCC (LYLO) | Mean MSEPD (LOEO) | Mean MSEPD (LYLO) | Mean MSPE (LOEO) | Mean MSPE (LYLO) |
|---|---|---|---|---|---|---|---|
| With the main EC effect | Baseline | 0.618 | 0.589 | 0.753 | 0.807 | 0.887 | 1.01 |
| | Kernel | 0.620 | 0.595 | 0.750 | 0.795 | 0.887 | 0.95 |
| | RRR1 | 0.617 | 0.589 | 0.754 | 0.805 | 0.889 | 0.961 |
| | RRR2 | 0.621 | 0.591 | 0.751 | 0.805 | 0.890 | 0.967 |
| | RFR | 0.620 | 0.591 | 0.751 | 0.803 | 0.890 | 0.965 |
| | FW1-US | 0.622 | 0.591 | 0.750 | 0.804 | 0.872 | 0.997 |
| | FW2-US | 0.619 | 0.590 | 0.759 | 0.805 | 0.890 | 0.962 |
| Without the main EC effect | Baseline | 0.618 | 0.589 | 0.753 | 0.807 | 0.868 | 1.02 |
| | Kernel | 0.620 | 0.595 | 0.750 | 0.795 | 0.868 | 1.02 |
| | RRR1 | 0.617 | 0.589 | 0.754 | 0.805 | 0.872 | 1.03 |
| | RRR2 | 0.621 | 0.592 | 0.751 | 0.805 | 0.874 | 1.04 |
| | RFR | 0.620 | 0.591 | 0.751 | 0.802 | 0.874 | 1.03 |
| | FW1-US | 0.622 | 0.591 | 0.750 | 0.804 | 0.871 | 1.03 |
| | FW2-US | 0.619 | 0.590 | 0.759 | 0.805 | 0.876 | 1.03 |

The models that use EC generally outperform the baseline in both the LOEO and LYLO scenarios, as seen from the mean performance of the models in Table 6. Only RRR1 and FW2-US lag behind the baseline in a few cases. Notably, all models benefited from inclusion of the mean regression on EC in LYLO CV, but not in LOEO CV.
However, if the CV statistics are aggregated as medians (Supplementary Information, Table S7), some of the trends differ. This is due to the left-skewed distribution of the PCC values and the right-skewed distributions of MSEPD and MSPE. Medians discount the very poor CV folds, and because the differences between models are small overall, the model ranking changes.
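The effect of skewness on the aggregated statistics can be illustrated with a minimal Python sketch. It assumes that MSEPD is computed over all pairwise genotype differences within a held-out environment; the function names `fold_metrics` and `summarize` and the toy numbers are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def fold_metrics(y_obs, y_pred):
    """Return (PCC, MSPE, MSEPD) for one held-out environment."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    pcc = np.corrcoef(y_obs, y_pred)[0, 1]
    err = y_pred - y_obs
    mspe = np.mean(err ** 2)
    # Error of a predicted difference between genotypes i and j equals
    # err[i] - err[j]; average its square over all pairs i < j (assumption).
    i, j = np.triu_indices(len(err), k=1)
    msepd = np.mean((err[i] - err[j]) ** 2)
    return pcc, mspe, msepd

def summarize(per_fold):
    """Aggregate per-fold statistics (rows = folds) by mean and median."""
    a = np.asarray(per_fold, dtype=float)
    return {"mean": a.mean(axis=0), "median": np.median(a, axis=0)}

# Hypothetical per-fold MSPE values with one very poor fold (right skew).
toy_mspe = [[0.5], [0.6], [0.7], [3.0]]
toy_summary = summarize(toy_mspe)
example_fold = fold_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

In the toy data a single poor fold pulls the mean well above the median, which is the mechanism by which model rankings can differ between the two aggregations when the differences between models are small.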
The same applies to the winter rice dataset (Supplementary Information), except that most models perform worse than the baseline in LYLO CV by both means and medians.
The comparison of model-based MVP and VPD with their CV-based counterparts (MSPE and MSEPD) is given in Tables 7 and 8. The median MVP approximates the ranking of models according to the mean MSPE quite well. The mean and median VPD do not differ much and also approximate CV model performance well, except for FW2-US. In the winter rice dataset, although the models do not outperform the baseline, the MVP and VPD estimates favor more complex models.

Table 7. Variance of the prediction (MVP) and mean squared prediction error (MSPE) from leave-one-year-and-location-out cross-validation (summer rice). Baseline – model without genotype-covariate interactions, Kernel – model with an environmental kernel matrix, RRR1 and RRR2 – reduced rank regression of rank one and two with observed covariates, RFR – random factorial regression with observed covariates, FW1-US and FW2-US – random factorial regression with one and two synthetic covariates

| Type | Model | MSPE, mean | MSPE, median | MVP, mean | MVP, median |
|---|---|---|---|---|---|
| With the main EC effect | Baseline | 1.01 | 0.656 | 0.956 | 0.930 |
| With the main EC effect | RRR1 | 0.96 | 0.660 | 0.92 | 0.896 |
| With the main EC effect | RRR2 | 0.97 | 0.665 | 0.93 | 0.902 |
| With the main EC effect | RFR | 0.96 | 0.665 | 0.92 | 0.896 |
| With the main EC effect | FW1-US | 1.00 | 0.642 | 0.92 | 0.919 |
| With the main EC effect | FW2-US | 0.96 | 0.625 | 0.90 | 0.898 |
| Without the main EC effect | Baseline | 1.02 | 0.644 | 0.941 | 0.945 |
| Without the main EC effect | RRR1 | 1.03 | 0.650 | 0.95 | 0.950 |
| Without the main EC effect | RRR2 | 1.04 | 0.672 | 0.96 | 0.956 |
| Without the main EC effect | RFR | 1.03 | 0.678 | 0.95 | 0.952 |
| Without the main EC effect | FW1-US | 1.03 | 0.671 | 0.94 | 0.942 |
| Without the main EC effect | FW2-US | 1.03 | 0.672 | 0.94 | 0.941 |

Table 8. Variance of the predicted difference (VPD) and mean squared error of predicted difference (MSEPD) from the leave-one-year-and-location-out cross-validation (summer rice). Baseline – model without genotype-covariate interactions, Kernel – model with an environmental kernel matrix, RRR1 and RRR2 – reduced rank regression of rank one and two with observed covariates, RFR – random factorial regression with observed covariates, FW1-US and FW2-US – random factorial regression with one and two synthetic covariates. VPD for the baseline model was taken from the standard ASReml-R output

| Type | Model | MSEPD, mean | MSEPD, median | VPD, mean | VPD, median |
|---|---|---|---|---|---|
| With the main EC effect | Baseline | 0.807 | 0.699 | 0.652 | 0.651 |
| With the main EC effect | RRR1 | 0.805 | 0.709 | 0.649 | 0.647 |
| With the main EC effect | RRR2 | 0.805 | 0.720 | 0.646 | 0.639 |
| With the main EC effect | RFR | 0.803 | 0.720 | 0.632 | 0.632 |
| With the main EC effect | FW1-US | 0.804 | 0.717 | 0.635 | 0.629 |
| With the main EC effect | FW2-US | 0.805 | 0.716 | 0.627 | 0.624 |
| Without the main EC effect | Baseline | 0.807 | 0.699 | 0.652 | 0.650 |
| Without the main EC effect | RRR1 | 0.805 | 0.711 | 0.649 | 0.647 |
| Without the main EC effect | RRR2 | 0.805 | 0.720 | 0.644 | 0.638 |
| Without the main EC effect | RFR | 0.802 | 0.720 | 0.630 | 0.631 |
| Without the main EC effect | FW1-US | 0.804 | 0.716 | 0.634 | 0.629 |
| Without the main EC effect | FW2-US | 0.805 | 0.716 | 0.627 | 0.624 |
Discussion
Behavior of the variance components: Including the main EC effect in the model reduced the variance component for location in both datasets. The interpretation is that EC explain a part of the location variation but not all of it; if they explained all of it, the variance component would be zero. Notably, FW1-US, which uses only one synthetic covariate in the fixed part, showed a smaller reduction of the location variance component. One explanation could be that a single synthetic covariate can capture underlying variation in only one dimension and thus may lag behind models that use several covariates.
Model performance in different CV scenarios: Most models improve on the baseline, but the magnitude of this improvement is small. This may seem to contradict the large reduction in variance components shown in Table 5. However, it must be borne in mind that the reduction of variance components depends only on the structure of the covariates in relation to the dataset and does not necessarily show a model’s ability to predict new data points (Sorensen 2023, p. 269). It is hard to pinpoint the exact reason, but it may be an interplay between the lack of fine-scale linkage of covariates to the developmental stages of rice and the inability of open weather data sources to capture the micro-climatic conditions of irrigation in rice. We hope that with better data resolution such models would bring a more substantial improvement. The differences between the two CV scenarios also highlight that validation based on observed EC values primarily reflects predictive ability for untested environments from the genotype’s perspective, whereas assessing model performance in unseen environments, meaning any environment in the future, should include removal of EC information to reflect the uncertainty of the unobserved year.
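The difference between the two CV scenarios can be sketched as follows. This is only an illustration under a stated assumption: an environment is taken to be a location-year combination, and the LYLO training set is assumed to drop every record that shares either the location or the year with the held-out environment. The function names and the toy environments are hypothetical.

```python
def loeo_split(envs, target):
    """Leave-one-environment-out: hold out one (location, year) pair."""
    test = [k for k, e in enumerate(envs) if e == target]
    train = [k for k, e in enumerate(envs) if e != target]
    return train, test

def lylo_split(envs, target):
    """Leave-one-year-and-location-out (assumed definition): also drop all
    training records from the held-out location or the held-out year."""
    loc, year = target
    test = [k for k, e in enumerate(envs) if e == target]
    train = [k for k, (l, y) in enumerate(envs) if l != loc and y != year]
    return train, test

# Hypothetical record-level environments: (location, year).
envs = [("A", 2019), ("A", 2020), ("B", 2019), ("B", 2020), ("C", 2021)]
loeo_train, loeo_test = loeo_split(envs, ("A", 2019))
lylo_train, lylo_test = lylo_split(envs, ("A", 2019))
```

Under this assumption the LYLO training set is always a subset of the LOEO training set, so the held-out environment shares neither a location effect nor a year effect with the training data.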
Fitting RRR and RFR models: RRR and RFR models can have a large number of effects when many EC are fitted, leading to singularities, unstable convergence, or overfitting. Some of these issues can be resolved by obtaining variance parameter estimates from a related model (e.g., all design-related components, the genotypic variance, and the residual genetic terms $\alpha L$, $\alpha Y$, and $\alpha LY$) and using these estimates as starting values. Fitting RFR can be made easier by pre-computing the variance-covariance matrix from a simpler RRR1 model and using it as starting values for the unstructured variance-covariance matrix. For prediction in an unseen environment (e.g., LYLO CV), one can also remove the $\alpha L$ and $\alpha Y$ residual genetic terms and simply fit $\alpha LY$, in case there is not much information to estimate the former.
Comparison with other studies: We explored model performance in a setting where exact EC values are unavailable for the next – or generally unseen – year. This scenario is less common than prediction using observed covariates for model validation, as often done with linear models (Hu et al. 2025; Nguyen et al. 2023; Tolhurst 2023) or machine and deep learning models (Yu et al. 2025; Zou et al. 2025). While this provides valuable insight for model comparison and often presents an optimistic view of the potential of EC use, it does not reflect the reality that exact EC values are inherently unknown for future years. Several studies have addressed this issue in different ways – for example, Costa-Neto et al. (2023) used quantile-based long-term, location-level environmental patterns to derive environmental weights, and Gillberg et al. (2019) computed the median of predictions from years with historical weather data to generate predictions for the unseen year. In this context, our work contributes by developing a model-based framework for quantifying prediction uncertainty under unknown future EC values. Regarding predictive performance improvement, although our results are relatively moderate for both LOEO and LYLO cross-validation, many of the cited studies report considerable improvements over their respective baselines, offering a generally optimistic outlook for the future application of EC in MET modeling.
Differences of distributions of model-based and CV-based values: The model-based MVP was close to the mean MSPE, and model-based VPD approached median MSEPD. A slight advantage of more complex models in the winter rice dataset by MVP and VPD can be attributed to the greater reduction in variance components achieved through more complex covariance structures. All the CV-based statistics have skewed distributions with quite a few outliers that negatively influence the mean PCC, MSEPD and MSPE (Supplementary Information, Figs. S2–S5). In addition, the distribution of model-based values has much less spread, which makes it complicated to efficiently compare the two visually. It would be desirable to test the means or medians of the related model-based and CV-based values. This, unfortunately, is not possible using known tests, because the assumption of independence of pairs is violated (Schulz-Kümpel et al. 2024). There are several considerations which could help explain the observed differences between those values (Tables 7, 8). One of them is that we do not account for the uncertainty related to the estimation of variance components. The other one is the fact that the model-based MVP assumes that the fitted model is the correct one. Hence, it misses the bias due to model misspecification (Sorensen 2023), which is a general phenomenon for model-based prediction uncertainty in mixed models.
Differences between the partial C-inverse matrix of ASReml and our full C-inverse matrix: We observed minor numerical differences between the C-inverse obtained using the method of Henderson (1984, Chapter 5) and the C-inverse obtained from the ASReml-R output. These small discrepancies are likely due to rounding differences in the underlying matrix computations of ASReml-R and our implementation. Additionally, the C-inverse matrix from ASReml-R contained many zeros, whereas our version was fully populated. A script illustrating the comparison between the two matrices is available in the GitHub repository associated with this paper.
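For orientation, a full C-inverse can be formed from the textbook mixed model equations by building the coefficient matrix and inverting it directly. The sketch below is a generic illustration with a hypothetical toy design (one fixed intercept, three genotypes with two records each, arbitrary variance values); it is not the authors' script and does not reproduce any ASReml-R output.

```python
import numpy as np

def mme_c_inverse(X, Z, G, R):
    """Invert the coefficient matrix C of the mixed model equations."""
    Rinv = np.linalg.inv(R)
    Ginv = np.linalg.inv(G)
    C = np.block([
        [X.T @ Rinv @ X, X.T @ Rinv @ Z],
        [Z.T @ Rinv @ X, Z.T @ Rinv @ Z + Ginv],
    ])
    return np.linalg.inv(C)

# Hypothetical toy design: intercept only, 3 genotypes x 2 records.
X = np.ones((6, 1))
Z = np.kron(np.eye(3), np.ones((2, 1)))
G = 0.5 * np.eye(3)   # assumed genotypic variance
R = 1.0 * np.eye(6)   # assumed residual variance
C_inv = mme_c_inverse(X, Z, G, R)

# Sanity check: the fixed-effect block of C-inverse equals the
# generalized least squares variance (X' V^-1 X)^-1 with V = Z G Z' + R.
V = Z @ G @ Z.T + R
var_fixed = np.linalg.inv(X.T @ np.linalg.inv(V) @ X)
```

A direct dense inverse like this is only practical for small problems, and sparse solvers typically compute only part of the inverse, which may be one reason a software-supplied matrix contains zeros where a full inverse does not.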
Extension to other sources of data: The model-based method for uncertainty estimation has applications beyond prediction for an unseen year. It is often overlooked that public EC datasets provide interpolated data, which introduces an inherent estimation error that must be accounted for. We therefore hope that in the future, not only EC rasters but also corresponding rasters of EC estimation variance will be made available to enable statisticians to produce more realistic assessments of prediction uncertainty. Another important consideration is that the covariates used in this study are environment-specific. However, it is possible in theory to create genotype-specific covariates based on developmental stages. In such a case the covariate vector $\boldsymbol{x}$ has to be indexed by genotypes. RFR, RRR, and FW-US models would work in such circumstances, with the only change being the notation. However, it would not be possible to create the kernel matrix in the same way as was suggested here, and a different approach should be implemented (e.g., Jarquín et al. 2014). 
Another minor modeling change would concern the multivariate model used to obtain $\mathbf{\Sigma}_{x^{\prime}}$, because the ${\left(LY\right)}_{lm}$ will not be confounded with the residual error, and the latter has to be modeled additionally. Genomic information can be incorporated into all the models through a genomic relationship matrix. The computation of prediction uncertainty is also generalizable to this case.
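To make concrete how covariate uncertainty can inflate a prediction variance, the following sketch uses the standard identity for the variance of an inner product of two independent random vectors. It is a generic illustration and is not claimed to be the estimator developed in this paper; `mu_x`, `Sigma_x`, `beta`, and `Sigma_b` are hypothetical stand-ins for the mean and covariance of the covariates and of genotype-specific slopes.

```python
import numpy as np

def var_inner_product(mu_x, Sigma_x, beta, Sigma_b):
    """Var(x'b) for independent x ~ (mu_x, Sigma_x) and b ~ (beta, Sigma_b)."""
    mu_x = np.asarray(mu_x, dtype=float)
    beta = np.asarray(beta, dtype=float)
    Sigma_x = np.asarray(Sigma_x, dtype=float)
    Sigma_b = np.asarray(Sigma_b, dtype=float)
    slope_part = mu_x @ Sigma_b @ mu_x        # slope uncertainty at the mean EC
    covariate_part = beta @ Sigma_x @ beta    # EC uncertainty at the mean slope
    joint_part = np.trace(Sigma_x @ Sigma_b)  # both uncertain simultaneously
    return float(slope_part + covariate_part + joint_part)

# Hypothetical toy values for two covariates.
mu_x = [1.0, 2.0]
Sigma_x = np.diag([0.5, 0.2])
beta = [0.3, -0.1]
Sigma_b = np.diag([0.04, 0.09])

v_unknown_ec = var_inner_product(mu_x, Sigma_x, beta, Sigma_b)
v_known_ec = var_inner_product(mu_x, np.zeros((2, 2)), beta, Sigma_b)

# Monte Carlo check of the identity under normality (an extra assumption).
rng = np.random.default_rng(0)
x = rng.multivariate_normal(mu_x, Sigma_x, size=200_000)
b = rng.multivariate_normal(beta, Sigma_b, size=200_000)
v_mc = np.var(np.sum(x * b, axis=1))
```

Setting the covariate covariance to zero recovers the variance for known EC values, so the gap between `v_unknown_ec` and `v_known_ec` is the part attributable to not knowing the covariates in advance.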
Conclusions
In conclusion, integrating environmental covariates through regression models can improve the prediction of genotype performance in new environments. Among the methods compared, FW1-US and the environmental kernel model emerged as promising, balancing prediction accuracy and model parsimony. We also demonstrated a novel approach to estimating prediction uncertainty, which can help quantify confidence in varietal recommendations when EC must be estimated. Our findings underscore the need for high-quality, high-resolution environmental data to enable more reliable selection decisions. Future research may explore adding non-linearity to these models or incorporating genomic data to further enhance predictions, but the linear mixed models reviewed here provide a strong and interpretable foundation for EC-based prediction.
Electronic supplementary material
Below is the link to the electronic supplementary material. Supplementary file 1 (DOCX 253 KB)
References
- 1. Brien C (2024) asremlPlus: Augments “ASReml-R” in fitting mixed models and packages generally in exploring prediction differences [Computer software]. https://github.com/briencj/asremlplus
- 2. Longford NT (1993) Random coefficient models (1st ed., Vol. 1). Clarendon Press. https://www.biblio.com/book/random-coefficient-models-longford-n/d/1467601876
- 3. Sorensen D (2023) Statistical learning in genetics: An introduction using R. Springer International Publishing AG. http://ebookcentral.proquest.com/lib/ubhohenheim/detail.action?docID=30749666
