# Inferring the Partial Correlation Structure of Allelic Effects and Incorporating it in Genome-wide Prediction

**Authors:** Carlos A. Martínez, Kshitij Khare, Syed Rahman, Mauricio A. Elzo

arXiv: 1705.02026 · 2017-05-08

## TL;DR

This paper introduces Bayesian and frequentist methods for genome-wide prediction that infer the unknown partial correlation structure of marker effects (the pattern of zeros of their precision matrix) and incorporate it when predicting those effects.

## Contribution

The paper develops Bayesian and frequentist methods based on Gaussian models to estimate the partial correlation structure of marker effects for genome-wide prediction, a novel approach in this context.
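As a rough illustration of what incorporating a precision matrix into prediction can look like, the sketch below computes the conditional mean of marker effects under a standard linear Gaussian model, y = Xg + e with g ~ N(0, Ω⁻¹) and e ~ N(0, σ²I). This parametrization, the function name `predict_marker_effects`, and the simulated data are assumptions made for illustration; they are not taken from the paper, whose methods also estimate Ω rather than treating it as known.

```python
import numpy as np

def predict_marker_effects(X, y, omega, sigma2_e):
    """Conditional mean of g given y, assuming (hypothetically) that
    y = X g + e, g ~ N(0, omega^{-1}), e ~ N(0, sigma2_e * I).

    Solves (X'X + sigma2_e * omega) g_hat = X'y.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    lhs = X.T @ X + sigma2_e * np.asarray(omega, dtype=float)
    rhs = X.T @ y
    return np.linalg.solve(lhs, rhs)

# Toy data: 50 individuals, 3 markers coded 0/1/2 and then centered.
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.integers(0, 3, size=(n, p)).astype(float)
X -= X.mean(axis=0)

# A sparse precision matrix: markers 1 and 3 are conditionally independent.
omega = np.array([
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
])

g_true = rng.multivariate_normal(np.zeros(p), np.linalg.inv(omega))
y = X @ g_true + rng.normal(scale=1.0, size=n)

g_hat = predict_marker_effects(X, y, omega, sigma2_e=1.0)
print(g_hat)
```

Setting `omega` to a multiple of the identity would reduce this to ridge regression, which treats marker effects as independent; a non-diagonal `omega` is what lets the prediction use dependence between effects.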

## Key findings

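- In a simulation study, the proposed methods accurately recovered the partial correlation structure and estimated the precision matrix.
- CONCORD-EM and Bayes G-Sel stood out in estimating the partial correlation structure, and a method based on CONCORD-EM gave the most accurate precision matrix estimates.
- The methods can serve both as predictors and as tools for learning how the effects of pairs of loci on a trait covary, conditional on the effects of all other loci in the model.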

## Abstract

In this study, we addressed the problem of genome-wide prediction accounting for partial correlation of marker effects when the partial correlation structure, or equivalently, the pattern of zeros of the precision matrix is unknown. This problem requires estimating the partial correlation structure of marker effects, that is, learning the pattern of zeros of the corresponding precision matrix, estimating its non-null entries, and incorporating the inferred concentration matrix in the prediction of marker allelic effects. To this end, we developed a set of statistical methods based on Gaussian concentration graph models (GCGM) and Gaussian directed acyclic graph models (GDAGM) that adapt the existing theory to perform covariance model selection (GCGM) or DAG selection (GDAGM) to genome-wide prediction. Bayesian and frequentist approaches were formulated. Our frequentist formulations combined some existing methods with the EM algorithm and were termed Glasso-EM, CONCORD-EM and CSCS-EM, whereas our Bayesian formulations corresponded to hierarchical models termed Bayes G-Sel and Bayes DAG-Sel. Results from a simulation study showed that our methods can accurately recover the partial correlation structure and estimate the precision matrix. Methods CONCORD-EM and Bayes G-Sel had an outstanding performance in estimating the partial correlation structure and a method based on CONCORD-EM yielded the most accurate estimates of the precision matrix. Our methods can be used as predictive machines and as tools to learn about the covariation of effects of pairs of loci on a given phenotype conditioned on the effects of all the other loci considered in the model. Therefore, they are useful tools to learn about the underlying biology of a given trait because they help to understand relationships between different regions of the genome in terms of the partial correlations of their effects on that trait.
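The equivalence the abstract relies on, between the partial correlation structure and the pattern of zeros of the precision matrix, follows from the standard Gaussian identity ρ_ij = −ω_ij / √(ω_ii ω_jj). The minimal sketch below applies that identity to a small, made-up precision matrix; the function name `partial_correlations` and the numbers are illustrative assumptions, not values from the paper.

```python
import numpy as np

def partial_correlations(omega):
    """Partial correlation matrix implied by a precision matrix omega.

    Uses rho_ij = -omega_ij / sqrt(omega_ii * omega_jj) for i != j,
    with the diagonal set to 1 by convention.
    """
    omega = np.asarray(omega, dtype=float)
    d = np.sqrt(np.diag(omega))
    rho = -omega / np.outer(d, d)
    np.fill_diagonal(rho, 1.0)
    return rho

# Hypothetical precision matrix for the effects of three markers.
# The zero in position (1, 3) encodes conditional independence of
# the effects of markers 1 and 3 given the effect of marker 2.
omega = np.array([
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
])

rho = partial_correlations(omega)
sigma = np.linalg.inv(omega)  # marginal covariance of the effects

print(rho)
print(sigma)
```

In this toy case the effects of markers 1 and 3 have zero partial correlation but a nonzero marginal covariance, which is why the sparsity is sought in the precision matrix rather than in the covariance matrix.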

---
Source: https://tomesphere.com/paper/1705.02026