Structural Equation Modeling of Genetic and Residual Covariance Matrices for Multiple-Trait Evaluation in Beef Cattle

Marcos Jun-Iti Yokoo; Gustavo de los Campos; Vinícius Silva Junqueira; Fernando Flores Cardoso; Guilherme Jordão Magalhães Rosa; Lucia Galvão Albuquerque

PMC · DOI:10.3390/ani16050817·March 5, 2026

Structural Equation Modeling of Genetic and Residual Covariance Matrices for Multiple-Trait Evaluation in Beef Cattle

Marcos Jun-Iti Yokoo, Gustavo de los Campos, Vinícius Silva Junqueira, Fernando Flores Cardoso, Guilherme Jordão Magalhães Rosa, Lucia Galvão Albuquerque

PDF

Open Access

TL;DR

This paper explores simpler statistical models for evaluating beef cattle traits, showing they can be as accurate as traditional methods while being more efficient.

Contribution

The study introduces structural equation models as a computationally efficient alternative to traditional methods for estimating genetic covariance in beef cattle.

Findings

01

FA2G and REC1 models showed comparable accuracy to traditional models in estimating breeding values.

02

FA2G reduced the effective number of parameters by 25.3 compared to standard models.

03

REC1 had a lower deviance information criterion than traditional models, making it a competitive alternative.

Abstract

In beef cattle, farmers collect vast amounts of data on many different traits that serve as selection criteria. Therefore, genetic breeding programs must estimate the breeding values by correcting for the relationships that affect the traits throughout the course of causality. For example, a measurement of a particular trait collected at an earlier age may influence another trait that will be collected at a later age. Estimating how these traits are transmitted genetically becomes increasingly complex as the number of traits and records increases. Farmers select animals for multiple traits. Therefore, breeding programs must consider the relationships among these traits. This study aimed to evaluate simpler mathematical methods, known as structural equation models, to assess their accuracy compared to traditional evaluation methods. Data from beef cattle were analyzed, looking at growth…

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Species1

Bos taurus(bovine · species)

Figures2

Click any figure to enlarge with its caption.

Funding2

—National Council for Scientific and Technological Development of Brazil (CNPq-Brasil)
—São Paulo Research Foundation (FAPESP)

Keywords

Bayesian methodfactor analysisgenetic evaluationmixed modelrecursive relationships

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsGenetic and phenotypic traits in livestock · Animal Behavior and Welfare Studies · Genetics and Plant Breeding

Full text

1. Introduction

Standard multiple-trait mixed models (SMTM) are widely used in genetic evaluations to estimate unstructured (co)variance matrices across multiple traits [1]. Under an unstructured (co)variance specification, all variances and covariances are freely estimated without imposing mathematical or biological constraints, and no assumptions are made regarding relationships among traits beyond symmetry and positive definiteness. However, the number of (co)variance parameters increases quadratically as the number of traits increases, and high phenotypic correlations may result in near-singular genetic and residual covariance matrices, thereby hindering algorithm convergence and compromising the reliability of statistical inference [1,2,3,4].

Structural equation models (SEM) are often used to obtain more parsimonious (and sometimes more interpretable) parametrization of multivariate models. This approach has been used in quantitative genetics to structure the genetic and environmental covariance matrices of multivariate linear mixed models [1,2,3,4,5,6,7,8], commonly used for genetic evaluations. This is primarily due to its ability to model causal relationships among latent variables, allowing for the modeling of complex phenomena while simultaneously reducing the dimensionality of the data [2,3,6,9]. SEM methodology comprises many different techniques and procedures used together to model covariance structures. These models represent a natural extension of the multivariate linear mixed modeling framework described by Henderson and Quaas [10] through the imposition of structured covariance parameterizations that can express functional networks among traits [11]. Gianola and Sorensen [5] investigated the application of recursive (REC) and simultaneous equation models, both of which are special cases of SEM, applied to phenotypes. De los Campos et al. [2] proposed a methodology to search for recursive and simultaneous effects among phenotypes within a multivariate linear mixed model, which has been applied to various species and traits [2,4]. Some authors discussed the use of REC models operating on genetic covariance matrices [6,12,13]. In certain cases, explicit modeling of covariance matrices facilitates the inference and analysis of causal relationships among breeding values of multiple traits or the evaluation of functions (e.g., ratios) of correlated traits, thereby enabling a reduction in model dimensionality through more parsimonious and efficient parameterizations [4,6], without compromising the accuracy of breeding value estimates [14].

Alternatively, factor analysis (FA), another specific case of SEM, can be used to reduce the dimension of the estimated genetic covariance matrix [3,7] and achieve a more parsimonious model of genetic effects within a multivariate linear mixed model, without decreasing the original number of traits and records. In the FA framework, the observed traits are modeled as linear combinations of fewer unobservable latent factors and trait-specific residual components, effectively capturing the common sources of variation underlying correlated traits while preserving individual trait identities. This approach has been successfully applied in quantitative genetics to structure genetic and environmental covariance matrices [3,8], proving particularly useful when analyzing highly correlated traits where traditional unstructured covariance matrices may suffer from convergence issues or poor estimability. The reduction in the number of parameters not only improves computational efficiency but can also lead to more stable parameter estimates and better model convergence, especially in datasets with limited sample sizes relative to the number of traits being analyzed [3]. Moreover, FA models can provide biological insights by identifying latent factors that may represent underlying biological processes or developmental pathways common to multiple phenotypes.

While previous studies have demonstrated the utility of SEM in various contexts, their application to structuring both genetic and residual covariance matrices independently in beef cattle breeding remains underexplored. In the current omics era, breeding programs are increasingly collecting larger numbers of correlated phenotypes, making computationally efficient, statistically robust methods for multi-trait genetic evaluation even more critical. These more parsimonious models offer practical advantages for routine genetic evaluations in breeding programs, enabling the simultaneous analysis of multiple traits across large populations with improved computational efficiency and numerical stability compared to fully parameterized models. We hypothesized that two different SEMs could be applied simultaneously and independently to the genetic and residual covariance matrices with a comprehensive accuracy using real beef cattle breeding data. Thereby, this study aimed to evaluate the performance of factor analysis and recursive models, applied independently to genetic and residual covariance matrices for growth and ultrasound-measured carcass traits in Nellore cattle, and to compare them with standard multiple-trait mixed models in terms of parameter estimation, breeding value estimation, and model fit.

2. Materials and Methods

2.1. Dataset Description

This research used records from 2942 Nellore cattle collected between 2004 and 2006. The data came from an established database comprising 10 herds enrolled in the Brazilian Nellore Breeding Program across six states. On-farm data collection was performed under standardized management protocols and in accordance with the guidelines of the National Research Council (NRC, [15]). To obtain carcass-related phenotypes without slaughtering the animals, real-time ultrasound technology was used, as described by Duff et al. [16] and Meškinytė et al. [17]. Real-time ultrasound carcass data included longissimus muscle area (LMA, cm^2^, Figure 1), backfat thickness (BF, mm, Figure 1), and rump fat thickness (RF, mm, Figure 2). Additional traits recorded at the same time included body weight (BW, kg) and hip height (HH, cm), as well as scrotal circumference adjusted to 450 days of age (SC, mm). Scrotal circumference adjusted to 450 days of age was obtained as described in the Guidelines for Uniform Beef Improvement Programs [18]. The traits were assessed in animals aged between 480 and 629 days of age.

Contemporary groups (CGs) were defined as animals of the same sex that were born in the same herd, year, and season (dry or rainy) and managed under identical rearing conditions until data recording. A total of 302 contemporary groups were formed, with an average of 9.7 animals per group (range: 3 to 45 animals). Additional effects considered in the analyses included age of animal at scanning (linear covariate for all traits except SC) and age of dam at calving (linear and quadratic covariates for BF, RF, HH, and BW; classes for LMA and SC). A description of the traits and data structure for Brazilian Nellore cattle is presented in Table 1.

2.2. Standard Multi-Trait Mixed Model

All models were evaluated using Bayesian inference implemented via Gibbs sampling. In the sections that follow, we first describe the SMTM commonly used in genetic evaluations, and subsequently present model extensions incorporating FA and REC structures.

In the SMTM, a sire model was fitted for p traits measured on n individuals. The mixed-model equation can be described as:

[eqn]

where $[eqn]$ is the vector of phenotypic observations, $[eqn]$ is the vector of fixed effects, $[eqn]$ is the vector of random additive sire effects, $[eqn]$ is the vector of residual effects, and $[eqn]$ and $[eqn]$ are the corresponding incidence matrices. Under the Bayesian framework, the likelihood function was specified as:

[eqn]

with prior distributions $[eqn]$ and $[eqn]$ , where $[eqn]$ is the $[eqn]$ additive genetic (co)variance matrix among sire effects, $[eqn]$ is the $[eqn]$ residual (co)variance matrix, $[eqn]$ is Wright’s numerator relationship matrix among sires, $[eqn]$ is an identity matrix of order n, and ⊗ denotes the Kronecker product operator.

Prior distributions for model parameters were specified as follows. Systematic effects included contemporary groups, animal age at scanning (linear effect, except for SC), and dam age (linear and quadratic effects for BF, RF, HH, and BW). For the vector of systematic effects $[eqn]$ with vague information, prior distributions were assigned as:

[eqn]

For the (co)variance matrices, scaled inverse Wishart distributions were assigned as prior distributions:

[eqn]

where $[eqn]$ and $[eqn]$ are $[eqn]$ scale matrices and $[eqn]$ and $[eqn]$ are degrees of freedom parameters. The joint posterior distribution of all model parameters is proportional to the product of the likelihood and prior distributions:

[eqn]

2.3. Factor Analytic Model

Factor analysis is a multivariate statistical technique that simplifies complex and correlated data structures [19,20]. It can be viewed as an extension of principal component analysis, where the factor-analytic variance-covariance structure serves as a parsimonious approximation to a fully unstructured variance-covariance matrix, enabling more efficient parameterization of complex correlation patterns. In a standard FA, a vector of random variables (e.g., $[eqn]$ , representing breeding values) is decomposed into a linear combination of a reduced number of unobservable random variables called common factors ( $[eqn]$ ), with an unobservable incidence matrix ( $[eqn]$ ) of factor loadings, plus a vector of trait-specific errors ( $[eqn]$ ) to each individual i that represent the lack of fit of the model. In compact notation:

[eqn]

In vector notation, the factor analytic model for these effects can be written as:

[eqn]

where $[eqn]$ , $[eqn]$ , and $[eqn]$ . This results in the following covariance matrix of $[eqn]$ :

[eqn]

where $[eqn]$ is a diagonal matrix of specific variances. Likewise, the residual vector ( $[eqn]$ ) can be decomposed into common and trait-specific factors. Consequently, the marginal distributions of $[eqn]$ and $[eqn]$ are:

[eqn]

where “iid” stands for “independent and identically distributed”.

To model the additive genetic and residual (co)variance matrices using FA, it was assumed that Equation (2) holds for the vector of random additive genetic effects ( $[eqn]$ ) in Equation (1) and, likewise, for the vector of random residual effects ( $[eqn]$ ) such that:

[eqn]

where $[eqn]$ , $[eqn]$ , $[eqn]$ and $[eqn]$ are as before and $[eqn]$ and $[eqn]$ are interpreted as vectors of common and specific additive genetic effects, respectively, similarly for the random residual effects ( $[eqn]$ ). Combining the assumptions of the FA model described above with those of SMTM leads to the following random effects joint distribution,

[eqn]

where $[eqn]$ and $[eqn]$ are the matrices of additive genetic and residual factor loads, respectively, $[eqn]$ and $[eqn]$ are the diagonal matrices of specific additive genetic and residual variances, respectively. More details on the implementation of this model can be found in [8].

2.4. Recursive Models

Recursive models are a category of SEMs that postulate causal and unidirectional relationships between latent variables, unlike simultaneous models, which admit mutual feedback effects between traits. For more information on the procedure for REC effects decomposition, see [2]. Using REC, the vector $[eqn]$ can be decomposed as:

[eqn]

where $[eqn]$ is the random effect of the ith sire, $[eqn]$ is a strictly lower-triangular matrix; i.e., $[eqn]$ for all $[eqn]$ , whose non-zero entries define recursive effects, and $[eqn]$ is a vector of random effects whose (co)variance matrix is: $[eqn]$ where $[eqn]$ is a diagonal matrix. The reduced form model of Equation (5) is:

[eqn]

Similarly to the FA model, to implement the REC to model the additive genetic and the residual (co)variance matrices, it was assumed that Equation (6) holds for the vector of random additive genetic effects ( $[eqn]$ ) in Equation (1), likewise for the vector of random residual effects ( $[eqn]$ ). Therefore, the additive genetic covariance matrix is equal to $[eqn]$ , as $[eqn]$ . Similarly, using REC for residuals, $[eqn]$ can be represented as $[eqn]$ , where $[eqn]$ is a strictly lower-triangular matrix defining recursive effects between model residuals, and $[eqn]$ is diagonal. A fully recursive model (FRM) occurs when all lower-triangular entries of $[eqn]$ and of $[eqn]$ are non-zero, i.e., all $[eqn]$ and $[eqn]$ with $[eqn]$ are parameters to be estimated. This model has as many (co)dispersion parameters as the SMTM, and there is a one to one map from the unknowns in $[eqn]$ to those in $[eqn]$ . Therefore, the FRM is just a re-parameterization of SMTM; however, various degrees of parsimony can be obtained by setting some of the lower-triangular entries of $[eqn]$ and $[eqn]$ to zero. Consequently, the models evaluated in this study differed in the structural specification of the genetic and residual (co)variance matrices, with distinct sets of constraints imposed on each, as detailed below.

2.5. Statistical Analyses

Bayesian inference and MCMC implementation: All genetic parameters were estimated using a Bayesian inference framework implemented via Gibbs sampling, as performed using the MTM package [21] in the R statistical computing environment [22]. An initial chain of 1,500,000 iterations was run to assess convergence. The final chain length, burn-in period, and thinning interval were determined using the criteria of [23], implemented in the Bayesian Output Analysis (BOA) package [24]. Convergence was further evaluated through visual inspection of trace plots for variance components. Based on these diagnostics, posterior inference was based on 146,000 samples obtained after discarding the first 40,000 iterations as burn-in and retaining every 10th iteration thereafter.

Model comparison: Multiple alternative models were fitted to assess the effectiveness of parsimonious covariance structures. The SMTM with unstructured (co)variance matrices served as the baseline, requiring estimation of 42 (co)variance parameters (21 for $[eqn]$ and 21 for $[eqn]$ ).

Factor analytic models: Given the limited number of traits (six), only models with two or three latent factors were evaluated for each (co)variance matrix; consequently, four factor-analytic (FA) models were assessed. The FA2F model specified two common factors for both additive genetic and residual (co)variance matrices, reducing the total number of parameters to 24 (12 genetic + 12 residual). The FA3F model extended this to three factors, increasing the parameter count to 36 (18 genetic + 18 residual). To evaluate the benefit of parsimony in only one (co)variance component, two additional models were fitted: FA2G applied two factors to $[eqn]$ while maintaining the unstructured $[eqn]$ (33 parameters total), whereas FA2R applied two factors to $[eqn]$ while maintaining the unstructured $[eqn]$ (33 parameters total).

Recursive models: In total, three REC were evaluated. First, a fully recursive model (FRM) was fitted, which is equivalent to the SMTM in terms of the number of parameters but reparameterizes the covariance structures via recursive pathways. Based on the results obtained from the FRM, and with the objective of reducing the number of estimated parameters, two additional reduced models were subsequently constructed by imposing zero constraints on selected recursive effects. The REC1 model set the $[eqn]$ th entry of $[eqn]$ to zero if $[eqn]$ or $[eqn]$ , where $[eqn]$ and $[eqn]$ denote the posterior mean and standard deviation of the $[eqn]$ th entry of $[eqn]$ from FRM, respectively. This criterion resulted in six recursive effects being removed from $[eqn]$ . The REC2 model removed any effect in $[eqn]$ or $[eqn]$ whose posterior mean from FRM had absolute value less than 0.15 (an arbitrary threshold chosen to evaluate a more parsimonious structure), resulting in removal of 15 recursive effects (10 from $[eqn]$ and 5 from $[eqn]$ ).

Model selection criteria: Considering that the SMTM represents the standard benchmark for estimating genetic values, models were compared using the deviance information criterion (DIC; [25,26]), the posterior mean of the log-likelihood [Mean( $[eqn]$ )], and the effective number of parameters ( $[eqn]$ ; [25,26]). Lower DIC values indicate better model fit penalized for complexity, while $[eqn]$ quantifies model complexity. In addition, Spearman’s rank and Pearson’s product–moment correlations between estimated breeding values (EBV) for the same traits obtained from the SMTM and alternative models (REC and FA) were computed to evaluate the consistency of animal ranking across modeling approaches.

3. Results

3.1. Genetic Parameters

Table 2, Table 3 and Table 4 report estimates of heritabilities (diagonal elements), genetic correlations (above the diagonal), and residual correlations (below the diagonal) assessed from the SMTM, FA2G, and REC1 models, respectively. Across these models, standard errors of heritability estimates ranged between 0.05 and 0.12. Correlations of Spearman and Pearson between EBV for the same traits across models were consistently high, ranging from 0.94 to 1.00.

For the genetic correlations, the standard errors obtained by these three models presented in Table 2, Table 3 and Table 4 ranged from 0.07 to 0.23. Considering the overlap within $[eqn]$ standard errors, differences between genetic correlation estimates obtained by SMTM (Table 2) and the REC1 (Table 4) model were negligible. In contrast, discrepancies between the SMTM (Table 2) and FA2G (Table 3) models were observed for LMA and BW, BF and RF, BF and HH, RF and HH, and BW and HH.

For the residual correlations, the standard errors obtained by these three models presented in Table 2, Table 3 and Table 4 ranged from 0.01 to 0.07. Considering these standard error ranges, differences in residual correlation estimates obtained with the SMTM (Table 2) and REC1 (Table 4) models were observed only for SC and BF. The differences between the residual correlation estimates obtained by the SMTM (Table 2) and FA2G (Table 3) models were not significant.

3.2. Recursive Effects

The ordering of traits within the model directly determines the identifiability of causal effects. For example, a recursive effect of adult weight acting on weaning weight for the same individual would likely not be identifiable. Because all traits analyzed in this study were recorded on the same day, their expression was observed simultaneously, which facilitates the specification of admissible causal relationships. Accordingly, estimates of genetic and residual recursive effects obtained from the FRM in Nellore cattle are presented in Table 5, under the assumption that causal effects flow from left to right (i.e., from the trait in the first column to subsequent traits). A variable may exert or receive causal influence (directly or indirectly) from another variable that precedes it in the ordering; however, such relationships represent simultaneous effects rather than recursive effects and are therefore not considered in the present study.

3.3. Model Comparison

The DIC, the Mean(L), and the $[eqn]$ used to compare the proposed models are shown in Table 6. Although the SMTM model is considered fully parameterized for the additive genetic and residual matrices, the $[eqn]$ were largest in models that have many (co)dispersion parameters, as in the FA3F and FA2R, compared to the SMTM model; thus, the model with the smallest $[eqn]$ was FA2G.

The models with the best (smallest) DIC and Mean(L) were REC1 and SMTM, respectively. Conversely, FA3F was the worst model for these three comparison criteria (DIC, Mean(L), and $[eqn]$ ).

4. Discussion

This study aimed to evaluate the potential advantages of using structured equation models, such as recursive and factor-analytic models, for sire modeling in beef cattle genetic evaluations of ultrasound carcass and growth traits. Given the magnitude of standard errors, differences between heritability estimates obtained by SMTM and the two best models (REC1 and FA2G) were only observed for BF and RF traits in the FA2G model, which yielded slightly lower estimates, except for SC in the REC1 model, while exhibiting the same pattern and remaining within one standard deviation. The EBVs estimated under the different models were highly consistent, with Pearson and Spearman correlation coefficients approaching 1.0 for each trait, indicating that selection decisions would be essentially unchanged using any of the methodologies considered here, despite the lower genetic variability (heritability estimates) of some traits in some models. Under a genetic evaluation framework adopting these SEM approaches to estimate breeding values, the corresponding variance components would be fixed using the Best Linear Unbiased Prediction (BLUP). Overall, the SMTM more effectively captured the heritability estimates for carcass traits measured by ultrasound (LMA, BF, and RF), as reported in Table 2, compared with the corresponding estimates presented in Table 3 and Table 4. In contrast, for the remaining growth-related traits (BW, HH, and SC), the REC1 model yielded superior performance in estimating heritabilities. This information is important for guiding the choice of the genetic evaluation model, with the objective of maximizing selection accuracy. Moreover, these estimates are consistent with those reported in previous studies [16,27].

The FA2G model estimated genetic correlations with the same pattern as the SMTM model, but with a lower magnitude. High genetic correlations among traits imply a genetic covariance matrix that is close to rank deficient, which often results in convergence difficulties, imprecise estimation of factor loadings, and redundancy among latent factors [3,19]. Consequently, FA models are expected to provide reliable approximations of the underlying covariance structure primarily when genetic correlations are of moderate or low magnitude, as discussed by [28]. Conversely, most genetic correlations estimated using the REC1 model were of slightly greater magnitude than those obtained with the SMTM, but yet very similar, indicating a good alternative for modeling additive genetic covariance matrices in beef cattle.

As expected, the residual covariance matrices estimated under the REC1 and FA2G models closely resembled those obtained under the SMTM, except for the association between SC and BF in the REC1 model. Nevertheless, the residual correlation in the REC1 model was basically null $[eqn]$ . De los Campos et al. [2] likewise did not observe significant changes in covariance estimates when applying REC or simultaneous models. Considering the reasonable agreement between covariance parameters and breeding values estimated by the REC1 and FA2G models with the SMTM, it could be suggested that these structural equations can be successfully used to reduce dimensionality of genetic evaluations, optimizing processing by a single analysis including all traits of interest and, consequently, providing faster and reliable estimation of breeding values compared to the traditional univariate analyses for each selection criteria.

Recursive models act separately on $[eqn]$ and $[eqn]$ , allowing specific patterns to be captured in each component (Table 3). Differences in the patterns of genetic and residual correlations can be interpreted in light of the estimated recursive effects, and based on this information, irrelevant effects can be suppressed from the covariance matrices [2,5,6,12]. The six recursive effects zeroed out in $[eqn]$ for the REC1 model were between RF and LMA, SC and LMA, BW and RF, HH and RF, SC and RF, and SC and HH. Consequently, the REC1 model excluded six recursive effects that were effectively negligible among the 21 parameters initially considered, indicating that these effects do not materially contribute to the estimation of breeding values for the traits involved. This likely reflects the fact that these residual causal effects do not operate at the genetic level nor directly influence trait expression. These results suggest that, when residual causal effects are close to zero, this class of SEM can facilitate the joint estimation of breeding values for multiple traits within a single analytical framework.

In terms of the $[eqn]$ criterion, all recursive models (FRM, REC1, and REC2) had smaller $[eqn]$ values than the SMTM model; even for FRM, which has the same dimension as the SMTM, some benefits were observed when using this kind of structural equation model. For the FA models, $[eqn]$ favored the FA2G (two factors) model over all other FA and REC models. However, when we tested a model with three factors (FA3F), the $[eqn]$ was the largest. This was expected due to the number of parameters (p) of $[eqn]$ in the FA, which can be calculated as $[eqn]$ parameters, where $[eqn]$ are the dimensions of the $[eqn]$ matrix of factor loads (Equation (2)). FA3F has 21 p in each matrix (additive genetic and residual), while FA2F has 17 p. In the SMTM, the p of $[eqn]$ can be calculated as $[eqn]$ parameters, where $[eqn]$ are the dimensions of the $[eqn]$ matrix, thus yielding 21 p in each matrix (additive genetic and residual) in the SMTM model. Depending on the data set, i.e., the number of records, a greater number of traits, and the most correlated traits, the FA models may be used as a special case of SEM to reduce the covariance matrices’ rank in the model and provide a more parsimonious estimation of genetic parameters compared to SMTM. In this paper, the FA3F model had a poorer fit to the data than the other models because this dataset has only six traits. Nevertheless, if the dataset had more traits, a model with more factors would probably work better at reducing the $[eqn]$ . Normally, increasing the number of factors in the model improves the representation of relationships among traits; however, when the number of traits is small, only a limited number of factors can be reliably estimated. In the present study, the analysis was restricted to six traits, and the number of records for SC was limited because this trait is recorded only in males ( $[eqn]$ ). This could at least partially explain the worst performance in terms of DIC, $[eqn]$ , and Mean(L) of the FA3F model. As discussed by [28,29], the FA model usually reduces the rank of covariance matrices. Still, the $[eqn]$ matrix has to be close to zero $[eqn]$ , when specific effects are assumed absent. This leads to a mixed-model formulation characterized by covariance matrices of reduced rank.

In the same sense, the Mean(L) and DIC demonstrated considerably better fits to the data for the models SMTM and REC1 than for all the other FA and REC models. Since DIC also accounts for dimensionality through a penalty ( $[eqn]$ ), the REC1 model may represent a viable alternative to the SMTM for beef cattle genetic evaluation procedures involving multi-trait analyses.

In general, even though FA2G was effective in terms of $[eqn]$ , comparing all the criteria and parameters, the REC1 model proved to be a suitable alternative modeling approach for this type of dataset (e.g., six traits) because it had a smaller $[eqn]$ compared to the SMTM model, better DIC, and a reasonable Mean(L) compared with all other models tested. This model, when applied to the residual effect, can be used to obtain more parsimonious representations of the residual covariance matrix. This reduces the number of parameters to be estimated, and may provide useful insight into the underlying patterns of (co)variation. Using complex traits (i.e., ratio), Jamrozik et al. [4] also demonstrated the advantages of applying REC models in multiple-trait evaluation. The results of the model comparison criteria demonstrate that growth and ultrasound-measured carcass traits in beef cattle can be effectively analyzed using REC for the estimation of breeding values and the inference of putative causal relationships among latent variables, allowing the representation of complex biological processes while simultaneously reducing the dimensionality of the data.

Breeding values estimated under FRM or REC differ conceptually from those obtained under SMTM. Under FRM or REC, breeding values represent direct genetic effects on trait expression without accounting for causal genetic effects from other traits included in the model. In contrast, breeding values derived from SMTM reflect total genetic effects, incorporating potential causal relationships among traits. Consistent with this interpretation, analysis of the biological meaning and causal structure of the recursive relationships estimated in the genetic matrix (Table 5) reveals a strong genetic influence of BW on HH. This result is biologically plausible, as many genes influencing BW also affect frame size [27], for which HH serves as an indicator trait. The resulting genetic architecture poses a challenging selection scenario: because BW and HH are positively correlated, achieving simultaneous gains in BW while maintaining or reducing HH is difficult. Moderate causal effects were also detected of BF on RF, indicating heterogeneity in subcutaneous fat deposition; and of LMA on BF, BW, and HH, suggesting that unfavorable genetic potential for muscle development adversely affects fat deposition, body weight, and skeletal growth. To maximize genetic gain and improve selection accuracy across multiple traits, the use of selection indices represents an effective alternative [30].

For the recursive relationships estimated in the residual matrix from the FRM (Table 5), moderate effects are observed of RF to BF; of BW to LMA; of HH to BW, and of SC to BW. These residual-level relationships indicate that environmental or management factors affecting one trait may exert correlated effects on others.

From a quantitative genetics perspective, RECs provide a coherent framework for representing causal relationships among traits because they enable explicit modeling and interpretation of the biological mechanisms underlying trait expression. Their routine use, however, is constrained by high computational demands that increase with model dimensionality and by the need for reliable prior knowledge of the causal structure [6,12], as misspecification may lead to biased estimates and misleading genetic inferences (e.g., assuming a causal effect of adult weight on weaning weight). Robust estimation further requires high-quality phenotypic records collected on the same individuals, with a large number of measurements, even if it is a trait measured in only one sex, such as SC. Despite these limitations, the models evaluated here showed satisfactory performance. Further studies involving more traits and breeds are warranted, particularly to assess computational efficiency. A key advantage of the SEM approach in this study is its capacity to specify structural assumptions independently for the genetic and residual (co)variance matrices, using either REC or FA parameterizations. The REC structure enables the estimation of genetic causal effects independently of residual covariances and allows the exclusion of negligible causal pathways, whereas the FA structure facilitates breeding value estimation by reducing the dimensionality of the genetic covariance matrix relative to fully unstructured models, provided that model fit is improved.

5. Conclusions

The results of this study demonstrate that growth and ultrasound-measured carcass traits in Nellore beef cattle can be effectively analyzed using factor-analytic and/or recursive models applied independently to the genetic and residual covariance matrices, enabling the joint estimation of breeding values for multiple traits within a single analytical framework. Depending on the size of the dataset, particularly the number of records and traits, factor analytic and recursive structures, applied either independently or jointly, can substantially reduce the effective rank of the covariance matrices and yield more parsimonious and stable estimates of genetic parameters than standard multiple-trait mixed models, especially when covariances among traits are of low magnitude.

Bibliography30

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Meyer K. Genetic principal components for live ultrasound scan traits of Angus cattle Anim. Sci.20058133734510.1079/ASC 50850337 · doi ↗
2De los Campos G. Gianola D. Heringstad B. A structural equation model for describing relationships between somatic cell score and milk yield in first-lactation dairy cows J. Dairy Sci.2006894445445510.3168/jds.S 0022-0302(06)72493-617033034 · doi ↗ · pubmed ↗
3Meyer K. Factor-analytic models for genotype× environment type problems and structured covariance matrices Genet. Sel. Evol.2009412110.1186/1297-9686-41-2119284520 PMC 2674411 · doi ↗ · pubmed ↗
4Jamrozik J. Johnston J. Sullivan P. Miglior F. Recursive model approach to traits defined as ratios: Genetic parameters and breeding values J. Dairy Sci.20171003767377210.3168/jds.2016-1217728284690 · doi ↗ · pubmed ↗
5Gianola D. Sorensen D. Quantitative genetic models for describing simultaneous and recursive relationships between phenotypes Genetics 20041671407142410.1534/genetics.103.02573415280252 PMC 1470962 · doi ↗ · pubmed ↗
6Varona L. González-Recio O. Invited review: Recursive models in animal breeding: Interpretation, limitations, and extensions J. Dairy Sci.20231062198221210.3168/jds.2022-2257836870846 · doi ↗ · pubmed ↗
7Jöreskog K.G. A general method for analysis of covariance structures Biometrika 19705723925110.1093/biomet/57.2.239 · doi ↗
8De los Campos G. Gianola D. Factor analysis models for structuring covariance matrices of additive genetic effects: A Bayesian implementation Genet. Sel. Evol.20073948110.1186/1297-9686-39-5-48117897592 PMC 2682801 · doi ↗ · pubmed ↗